First, I have read about this problem.
I have a np.array (from a picture):
[[255 255 255 ... 255 255 255]
[255 255 0 ... 255 255 255]
[255 255 255 ... 255 255 255]
...
[255 255 0 ... 0 255 255]
[255 255 0 ... 255 255 255]
[255 255 255 ... 255 255 255]]
I want to delete every row in which the number of 0s
is smaller than a specific value.
My code is:
import numpy as np
from collections import Counter

for i in range(pixelarray.shape[0]):
    # Counter(pixelarray[i])[0] is the number of 0s in row i.
    if Counter(pixelarray[i])[0] < 2:  # if the row has fewer than 2 zeros, delete it
        pixelarray = np.delete(pixelarray, i, axis=0)  # delete the row
print(pixelarray)
But it raised the error:
Traceback (most recent call last):
  File "E:/work/Compile/python/OCR/PictureHandling.py", line 23, in <module>
    if Counter(pixelarray[i])[0] <= 1:
IndexError: index 183 is out of bounds for axis 0 with size 183
What should I do?
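np.delete is probably not the best choice for this problem. The error occurs because range(pixelarray.shape[0]) is evaluated once, before the loop starts, while every deletion shrinks the array, so later values of i run past its end (and the row following each deleted row is skipped). This can be solved simply by masking out the rows that do not meet the required criteria. For that, you start by counting the number of zeros per row: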
zeros_per_row = (pixelarray == 0).sum(1)
This first compares each value in pixelarray with zero, and then sums (that is, counts the True values) along the columns (axis 1), so you get the number of zeros in each row. Then, you can simply do:
rows_with_min_zeros = pixelarray[zeros_per_row >= MIN_ZEROS]
Here, zeros_per_row >= MIN_ZEROS produces a boolean array that is True for every row whose zero count is greater than or equal to MIN_ZEROS. Using boolean array indexing, this mask excludes the rows where it is False, that is, the rows where the number of zeros is less than MIN_ZEROS.
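Putting the two steps together, here is a minimal, self-contained sketch. The small pixelarray and the threshold MIN_ZEROS = 2 are made-up stand-ins for the real image data and the "specific value" from the question. It also shows that np.delete can give the same result if you pass it all the unwanted row indices in a single call instead of deleting inside a loop.

```python
import numpy as np

MIN_ZEROS = 2  # assumed threshold; use whatever value you need

# Made-up 4x4 stand-in for the image array from the question.
pixelarray = np.array([
    [255, 255, 255, 255],
    [255,   0,   0, 255],
    [255,   0, 255, 255],
    [  0,   0,   0, 255],
])

# Count the zeros in each row: compare with 0, then sum along axis 1.
zeros_per_row = (pixelarray == 0).sum(axis=1)
print(zeros_per_row)  # [0 2 1 3]

# Boolean mask: True for the rows to keep.
keep = zeros_per_row >= MIN_ZEROS
rows_with_min_zeros = pixelarray[keep]
print(rows_with_min_zeros)

# Same result with np.delete, removing all unwanted rows in one call.
rows_via_delete = np.delete(pixelarray, np.flatnonzero(~keep), axis=0)
```

With this sample data only the second and fourth rows are kept. Neither approach changes the array while iterating over it, so the IndexError from the question cannot occur.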