COVID-19 Pneumonia CT Recognition with COVID-CT (1): CT Recognition Methods and the CT Data Set

Preface


  A few days ago, my browser pushed me an article about how researchers from the University of California, San Diego and Petuum had built an open-source COVID-CT data set. I looked through it and its open-source code, which suit beginners like us as a practical application of the material in my previous notes. COVID-19 pneumonia is also still a hot topic, so let's play with it. This is also the first time I have written notes relevant to my own major, so treat this post as a preface.

  Original code and data: UCSD-AI4H/COVID-CT


CT recognition of COVID-19 pneumonia


  • Virus source

  The virus travels through the respiratory tract into the lungs. The lungs consist mostly of alveoli, which are filled with air. Air attenuates X-rays very little, so its CT number is very low, and healthy lungs therefore look almost black in CT images. Once the new coronavirus reaches the lungs, it spreads through the alveolar pores and attaches to the alveolar epithelium. This triggers an immune response in which large numbers of immune cells fight the virus. The fight changes the physical properties of the alveoli (alveolar swelling, exudation of interstitial fluid, thickening of the alveolar septa, and so on), and these changes leave traces in the CT image. We therefore judge whether an infection with the new coronavirus is likely by looking at these traces in the lung CT together with the characteristics of the virus.
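  To see why air looks black, here is a minimal sketch of the standard CT number (Hounsfield unit) definition, which rescales a material's attenuation coefficient relative to water. The two coefficient values are rough illustrative assumptions, not measurements, and `ct_number` is just a name I made up.

```python
def ct_number(mu, mu_water):
    """CT number in Hounsfield units: 1000 * (mu - mu_water) / mu_water."""
    return 1000.0 * (mu - mu_water) / mu_water

# Illustrative linear attenuation coefficients in 1/cm (assumed values).
MU_WATER = 0.19
MU_AIR = 0.0002  # air attenuates almost nothing

hu_water = ct_number(MU_WATER, MU_WATER)  # water is 0 HU by definition
hu_air = ct_number(MU_AIR, MU_WATER)      # air ends up close to -1000 HU

print(f"water: {hu_water:.0f} HU, air: {hu_air:.0f} HU")
```

  Anything that replaces air in the alveoli with fluid or thickened tissue moves the local CT number up from the air end of this scale, which is why lesions show up brighter than healthy lung.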

  • CT features

  Ground glass shadow

  Ground-glass shadows form mainly because the new coronavirus, after invading the lungs, spreads along the alveolar pores and in the process causes alveolar swelling, fluid leakage from the alveolar septa, and thickening of the alveolar septa. All of these raise the CT number of the lung tissue, so the affected region turns whiter. Viral infections produce no granulocyte exudation, so the alveoli stay clean and still contain air. As a result there is usually no consolidation (a solid white block) but a hazy ground-glass shadow, through which the bronchi can still be seen, as shown in Figure 1 (quoted from the report "Early Diagnosis and Differential Diagnosis of New Coronavirus Infection by CT" by Zhao Zhenjun, Deputy Administrative Director of the Department of Radiology, Guangdong Provincial People's Hospital).

Figure 1. Ground glass shadow

  Firework spread

  The new coronavirus has a diameter of 60-140 nanometers, while the alveolar pores measure 10-15 microns. The pores are therefore roughly two orders of magnitude larger than the virus (about 70 to 250 times), so the virus spreads mainly through the alveolar pores. Multiple ground-glass shadows then appear to spread outward from a center, and because the lobules do little to block them, the shadows stay connected in the middle, as shown in Figure 2 (quoted from the report "Early Diagnosis and Differential Diagnosis of New Coronavirus Infection by CT" by Zhao Zhenjun, Deputy Administrative Director of the Department of Radiology, Guangdong Provincial People's Hospital). Bacteria, by contrast, are larger, spread mainly through the bronchioles, and so are distributed along the bronchi.
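  The size comparison is easy to check by hand. A small sketch, using only the two size ranges quoted above (the variable names are mine):

```python
import math

NM_PER_UM = 1000  # nanometers per micron

virus_nm = (60, 140)  # virus diameter range in nanometers
pore_um = (10, 15)    # alveolar pore size range in microns
pore_nm = tuple(size * NM_PER_UM for size in pore_um)

# Smallest ratio: smallest pore vs. largest virus; largest ratio: the reverse.
min_ratio = pore_nm[0] / virus_nm[1]
max_ratio = pore_nm[1] / virus_nm[0]

print(f"{min_ratio:.0f}x to {max_ratio:.0f}x")  # 71x to 250x
print(f"{math.log10(min_ratio):.1f} to {math.log10(max_ratio):.1f} orders of magnitude")  # 1.9 to 2.4
```

  So even the largest virus particle is about 70 times smaller than the smallest pore, which is consistent with the pores not being an obstacle.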

Figure 2. Firework spread

  Fine grid or striped shadow

  If the first two features are present and, on magnifying the ground-glass area, you see a fine grid like the one in Figure 3 (quoted from the report "Early Diagnosis and Differential Diagnosis of New Coronavirus Infection by CT" by Zhao Zhenjun, Deputy Administrative Director of the Department of Radiology, Guangdong Provincial People's Hospital), a new coronavirus infection is very likely. All of the above are features that people rely on when reading the scans. Which features a machine uses as its basis for recognition is a separate question; perhaps they are similar.

Figure 3. Fine grid shadow

COVID-CT data


  • CT images

  The code package we downloaded includes the data set. At the moment it contains 349 CT images from COVID-19-positive cases and 397 CT images from COVID-19-negative cases. That is more than the counts in the authors' published description of the data set, so some images were presumably added later. Even so, the data set is still very small, and training on so little data alone cannot produce a good model. The model therefore has to be pre-trained on other data first, and the COVID-19 recognition model is then trained with transfer learning.

Figure 4. The images at a glance

  • Data set partition list

  The authors have already split the data set into a training set, a validation set, and a test set. The split is made through path lists rather than by physically separating the images: each set has its own list files of image paths, and when the data are read, each image is handed to the data set object whose list contains it.
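  A minimal sketch of this kind of path-based split, assuming each list file is plain text with one image file name per line. The helper `load_split`, the directory names, and the file names are all hypothetical and only illustrate the idea; they are not taken from the authors' code.

```python
from pathlib import Path

def load_split(image_dir, lines, label):
    """Pair each image name listed in `lines` with a class label.

    `lines` is any iterable of strings, for example an open list file.
    Blank lines are skipped.
    """
    image_dir = Path(image_dir)
    names = (line.strip() for line in lines)
    return [(image_dir / name, label) for name in names if name]

# Hypothetical list contents. With real files this would be something like:
#     with open(list_file) as lines:
#         samples = load_split(image_dir, lines, label)
train_covid = ["covid_001.png\n", "covid_002.png\n", "\n"]
train_noncovid = ["normal_001.png\n"]

# Label 1 = COVID-19, label 0 = NonCOVID-19 (an assumed convention).
train_samples = (load_split("images/covid", train_covid, 1)
                 + load_split("images/noncovid", train_noncovid, 0))

for path, label in train_samples:
    print(path, label)
```

  The images themselves never move; swapping in a different pair of list files is all it takes to get the validation or test set.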

Figure 4. Path list file

  

Type     NonCOVID-19   COVID-19   Total
train    234           191        425
val      58            60         118
test     105           98         203

Table 1. Data set split results
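  As a quick sanity check, the numbers in Table 1 can be added up and compared with the 397 negative and 349 positive images counted earlier. This is only a sketch with the table values typed in by hand:

```python
# Counts copied from Table 1.
splits = {
    "train": {"NonCOVID-19": 234, "COVID-19": 191},
    "val":   {"NonCOVID-19": 58,  "COVID-19": 60},
    "test":  {"NonCOVID-19": 105, "COVID-19": 98},
}

# Images per split (the "Total" column) and images per class.
row_totals = {name: sum(counts.values()) for name, counts in splits.items()}
class_totals = {
    cls: sum(counts[cls] for counts in splits.values())
    for cls in ("NonCOVID-19", "COVID-19")
}
grand_total = sum(row_totals.values())

# Share of the whole data set that each split receives.
fractions = {name: total / grand_total for name, total in row_totals.items()}

print(row_totals)
print(class_totals, grand_total)
print({name: round(frac, 3) for name, frac in fractions.items()})
```

  The class totals come out to 397 and 349, so the lists cover every image, and the split is roughly 57% training, 16% validation, and 27% test.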
  • Other files

  There are two other files: COVID-CT-MetaInfo.xlsx and NonCOVID-CT-MetaInfo.xlsx. They contain some information about the patient behind each image, such as age; one patient can correspond to multiple images. We don't need this information, and NonCOVID-CT-MetaInfo.xlsx would not open for us, so we will leave both files aside.


Model performance


  Figure 5 shows the model performance that the authors report in the document introducing the data set. Its accuracy is very high, but its recall is less ideal. We do not yet know how well a model trained with the authors' code actually performs; that is discussed in the next note.
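  To make the difference between accuracy and recall concrete, here is a small sketch of how both are computed from a confusion matrix. The counts are made up for illustration and are not the authors' results, and `binary_metrics` is a hypothetical helper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, precision, and recall from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,  # share of all predictions that are right
        "precision": tp / (tp + fp),    # share of predicted positives that are real
        "recall": tp / (tp + fn),       # share of real positives that are found
    }

# Hypothetical counts: 100 positive and 100 negative scans.
metrics = binary_metrics(tp=70, fp=5, tn=95, fn=30)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

  With these invented counts the accuracy is 0.825 while the recall is only 0.700: the model is rarely wrong when it says "positive", yet it still misses 30 of 100 positive scans. For a screening task, those missed cases are the costly ones, which is why a modest recall matters even when accuracy looks good.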

Figure 5. Model performance

  
  Next: COVID-19 Pneumonia CT Recognition with COVID-CT (2) | Deep Eyes PyTorch Check-in (8): COVID-19 Pneumonia CT Image Recognition (Binary Classification | Logistic Regression)


References


  https://v.qq.com/x/page/t3068am79y7.html
  https://www.jiqizhixin.com/articles/2020-04-03
  https://github.com/UCSD-AI4H/COVID-CT

Origin blog.csdn.net/sinat_35907936/article/details/105673737