Sensitive file leaks on the web

Depending on the sensitivity of the leaked information, file leaks can be rated as medium-risk or even high-risk web vulnerabilities. This article
introduces some common leaks, divided into three groups: leaks caused by version management software, leaks caused by file inclusion, and leaks caused by misconfiguration.

Leaks caused by version management software

git

Git is arguably the most popular version control software today. Many Git-based hosting services offer
free hosting, and many even support free private repositories, such as Bitbucket and the Chinese service gitosc (Open Source China).

Key documents

When git initializes a project, it creates a hidden folder named .git in the root directory of the project
(the root can be found with git rev-parse --show-toplevel), which contains the history of all local commits. If you accidentally place this directory under the web path where users can access it,
almost all of the source code and other sensitive information is leaked.

Leaked content

  • All source code of the project
  • The address of the private repository
  • Private configuration information
  • Email addresses of all committers
  • (Possibly) internal account and password
  • ...

How to use

The conventional method is to download the entire directory and then use the git command to restore the whole project:

wget -r --no-parent --mirror http://www.example.com/.git

cd www.example.com && git reset --hard

Of course, there are also some automated scripts:

  • dvcs-ripper : a Perl-based tool that supports many other version management systems besides git
  • GitHack
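
Before mirroring a whole directory, it is usually enough to check whether .git/HEAD is readable. The following is a minimal Python sketch of that check, not how dvcs-ripper or GitHack actually work; the function names are made up for illustration, and it assumes a classic 40-character SHA-1 hash or a symbolic ref in HEAD.

```python
import re
from typing import Callable, Optional
from urllib.error import URLError
from urllib.request import urlopen

# A genuine .git/HEAD holds either a symbolic ref or a raw commit hash.
HEAD_PATTERN = re.compile(rb"^(ref: refs/\S+|[0-9a-f]{40})\s*$")


def looks_like_git_head(body: bytes) -> bool:
    """Return True if body has the shape of a real .git/HEAD file."""
    return HEAD_PATTERN.match(body) is not None


def http_get(url: str) -> Optional[bytes]:
    """Fetch url and return the first bytes of the body, or None on any error."""
    try:
        with urlopen(url, timeout=5) as resp:
            return resp.read(256)
    except (URLError, OSError, ValueError):
        return None


def git_exposed(base_url: str,
                fetch: Callable[[str], Optional[bytes]] = http_get) -> bool:
    """Check whether base_url serves a readable .git/HEAD."""
    body = fetch(base_url.rstrip("/") + "/.git/HEAD")
    return body is not None and looks_like_git_head(body)
```

Checking the content rather than only the status code avoids false positives from sites that answer every path with a custom error page. Only run it against hosts you are authorized to test.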

Repair suggestions

Generally, modern MVC-based web frameworks do not serve files straight from the project directory, but projects based on PHP, ASP and similar languages are still at risk.

Although you can deny access to the .git path in the web server configuration (apache/nginx, etc.), there is a risk that the rule is accidentally bypassed.

The best way is to create a www directory inside the project to hold the source files that need to be served, and to use that directory as the web root so that .git stays outside it.
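
As an extra safeguard, a deployment script can refuse to publish a web root that still contains version control metadata. A minimal sketch, assuming the web root is a local directory; the function name and the directory list are illustrative and cover only the tools discussed in this article:

```python
import os
from typing import List

# Metadata directory names of the tools covered in this article.
VCS_DIRS = {".git", ".hg", ".svn", ".bzr", "CVS"}


def find_vcs_metadata(web_root: str) -> List[str]:
    """Return paths (relative to web_root) of VCS metadata directories."""
    found = []
    for current, dirnames, _filenames in os.walk(web_root):
        for name in list(dirnames):
            if name in VCS_DIRS:
                full = os.path.join(current, name)
                found.append(os.path.relpath(full, web_root))
                # No need to descend into the metadata directory itself.
                dirnames.remove(name)
    return sorted(found)
```

A deploy step could abort whenever the returned list is non-empty.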

hg/Mercurial

Mercurial means mercury, hence the abbreviation hg (the chemical symbol Hg), and it is also version management software. Its usage is similar to git, but it keeps the concise style of the svn commands
and natively supports the three major platforms Windows/macOS/Linux, unlike Git, which needs MinGW to run on Windows, so many people still prefer hg for version control.
There are discussions comparing the two, such as why hg is used and
why hg is chosen instead of git, which are also worth reading.

Key documents

Similar to git, when hg initializes a project, it creates a hidden folder named .hg in the root directory of the project,
which contains the modification history of the code and branches as well as information about the developers.

Leaked content

  • The source code of the project
  • Project repository address
  • (Possibly) username for the repository
  • ...

How to use

Manual exploitation: download, then roll back:

wget -r --no-parent --mirror http://www.example.com/.hg

cd www.example.com && hg revert --all

You can also use the dvcs-ripper tool mentioned above.

Repair suggestions

Same as git

svn/Subversion

svn, or Subversion, was a popular version management tool before GitHub. Although it has been declining, it
is still the main version management tool in many state-owned enterprises, research institutes and similar places. Some projects with a long history, such as LLVM,
have also mainly used svn to manage their source code for historical reasons.

Key documents

svn also creates a hidden folder named .svn in the project root directory, which contains all branch commit information and code records.

Leaked content

  • All source code of the project
  • svn repository address
  • The username of the svn repository's owner
  • ...

How to use

Likewise, download the directory first and then roll back:

wget -r --no-parent --mirror http://www.example.com/.svn

cd www.example.com && svn revert --recursive .

Tools & scripts:

  • dvcs-ripper : supports both old and new versions of svn
  • Seay-Svn : Seay's tool, for the Windows platform

Repair suggestions

Same as git

bzr/Bazaar

Bzr is also a version control tool. Although it is not very popular, it supports multiple platforms and has a good graphical interface.
Therefore, some people think that bzr is better than git,
but for penetration testers this doesn't really matter.

Key documents

When bzr initializes a project (bzr init/init-repo), it creates a hidden directory named .bzr in the project root directory, which likewise exposes the source code and user information.

Leaked content

  • Source code
  • Repository address
  • Developer information
  • ...

How to use

I haven't used bzr myself, but according to the documentation, the bzr revert command can be used to roll back:

wget -r --no-parent --mirror http://www.example.com/.bzr

cd www.example.com && bzr revert

Of course, the dvcs-ripper tool also works.

Repair suggestions

Same as git

cvs

CVS is a relatively old version control system that tracks the change history of source code.
But because its features are relatively simple and its branching support is limited, it was replaced long ago by the svn mentioned above.

Key documents

When a cvs project is checked out (cvs checkout project), a directory named CVS is created in the project directory,
which stores the modification and commit records of each file. Through this directory, you can get historical versions of the code. The two key files are
CVS/Root and CVS/Entries, which record the root information of the project and the structure of all files, respectively.
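
To give an idea of what CVS/Entries reveals, here is a minimal Python sketch of a parser. It assumes the usual layout, where file lines look like /name/revision/timestamp/options/tagdate and directory lines look like D/name////; the type and function names are made up for illustration.

```python
from typing import List, NamedTuple


class CvsEntry(NamedTuple):
    name: str
    is_dir: bool
    revision: str  # empty for directories


def parse_cvs_entries(text: str) -> List[CvsEntry]:
    """Extract file and directory names from the text of a CVS/Entries file."""
    entries = []
    for line in text.splitlines():
        fields = line.split("/")
        # Skip blank lines, a lone "D" marker, and anything unrecognized.
        if len(fields) < 3 or fields[0] not in ("", "D") or not fields[1]:
            continue
        is_dir = fields[0] == "D"
        entries.append(CvsEntry(fields[1], is_dir, "" if is_dir else fields[2]))
    return entries
```

Each directory entry points to another CVS/Entries file one level down, so the listing can be walked recursively to recover the file tree.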

Leaked content

Because it is purely a client-side tool, only the source code is leaked.

How to use

Download the CVS folder and use the cvs command to obtain the source code information. There seems to be no direct rollback operation, so some additional processing is required.

wget -r --no-parent --mirror http://www.example.com/CVS
cd www.example.com && cvs diff *

Or use the dvcs-ripper tool directly.

Repair suggestions

If you are still using CVS, perhaps you are also still writing CGI in Perl? ...

other

There are many other version management tools besides the ones mentioned above, including well-known ones such as BitKeeper. They are rarely used now,
but they still occasionally rise from the dead in CTF competitions.

Leakage caused by file inclusion

In addition to the leakage caused by the above version management tools, improper configuration is also one of the important reasons for information leakage.

.DS_Store file leak

.DS_Store (Desktop Services Store) is a hidden file found in macOS directories. It contains the structure of the current directory and some custom display information,
such as the background and icon positions. The similar file on Windows is desktop.ini. Exposing the .DS_Store file is equivalent to exposing the names of everything in that directory,
which can be a serious leak.

How to use

The .DS_Store format is binary, and its internal data structure is proprietary.
You can parse it and recursively download all the files it lists; see lijiejie's ds_store_exp.

Repair suggestions

Developers working on macOS can add .DS_Store to the ignore list (such as .gitignore). In essence, though, it only leaks the directory structure: even if .DS_Store is deleted, the
files still sit where the web server can serve them, so the permanent solution is not to put sensitive information in the web path.

WEB-INF leak

The Java Servlet documentation
says that the WEB-INF directory "contains resources that are used by all web applications but are not in the web path", that is, the content of the WEB-INF directory is not part of the public pages.
Web applications can access these resources from the servlet context through APIs such as getResource.

Developers usually put many JSP files, JAR packages and Java class files in this directory. Its contents are generally predictable:

WEB-INF/web.xml: Web application configuration file, which describes the configuration and naming rules of servlets and other application components.

WEB-INF/database.properties: database configuration file

WEB-INF/classes/: Generally used to store Java class files (.class)

WEB-INF/lib/: used to store the packaged library (.jar)

WEB-INF/src/: used to store source code (.asp and .php etc.)

How to use

Use the web.xml file to infer the class names of the application's components, then look for the code in the src directory. If there is no source code, you can download the class files directly and decompile them.
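
The mapping from web.xml to class file paths is mechanical: a fully qualified class name such as com.example.LoginServlet (a hypothetical name) corresponds to WEB-INF/classes/com/example/LoginServlet.class. A minimal Python sketch, assuming a leaked web.xml is already available as a string; the function names are illustrative:

```python
import xml.etree.ElementTree as ET
from typing import List

CLASS_TAGS = ("servlet-class", "filter-class", "listener-class")


def local_name(tag: str) -> str:
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def class_file_paths(web_xml: str) -> List[str]:
    """List the WEB-INF/classes paths implied by the classes named in web.xml."""
    root = ET.fromstring(web_xml)
    paths = []
    for elem in root.iter():
        if local_name(elem.tag) in CLASS_TAGS:
            class_name = (elem.text or "").strip()
            if class_name:
                path = class_name.replace(".", "/") + ".class"
                paths.append("WEB-INF/classes/" + path)
    return paths
```

Classes that ship inside a JAR under WEB-INF/lib/ will not be found at these paths, so a miss does not prove the class is absent.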

Repair suggestions

Before deploying, confirm that access to the WEB-INF directory is forbidden, or set filter rules on the server.

Leaked backup files

Backup file leaks fall into two cases. In the first, operations staff lazily back up the website with a command like tar -czvf bakup.tgz * in the root directory of the website,
so that the source code of the entire site can be downloaded as a single archive. In the second, the editor used by developers or operations staff automatically saves a backup of the page being edited,
such as vim's .swp files, thereby leaking the source code of the page.

How to use

For archives, penetration testers can scan the website with {common file name}+{common archive suffix} combinations, which may bring pleasant surprises.
For temporary backups of web pages, scan for the corresponding page with suffixes such as .swp or .bak; you may find useful information.
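
The candidate paths for such a scan can be generated rather than written by hand. The sketch below is only an illustration: the word lists are small assumed samples, not a real dictionary, and the function names are made up. Note that vim names its swap file .<name>.swp in the same directory as the edited file.

```python
from itertools import product
from posixpath import basename, dirname, join
from typing import List

# Assumed, illustrative word lists; a real scan would use a larger dictionary.
ARCHIVE_NAMES = ["backup", "bakup", "www", "web", "site"]
ARCHIVE_SUFFIXES = [".zip", ".rar", ".tar.gz", ".tgz", ".7z"]


def archive_candidates(host: str) -> List[str]:
    """Combine common names (plus the host name) with common archive suffixes."""
    names = ARCHIVE_NAMES + [host]
    return ["/" + name + suffix
            for name, suffix in product(names, ARCHIVE_SUFFIXES)]


def editor_backup_candidates(page: str) -> List[str]:
    """Guess editor leftovers for one page path such as /admin/login.php."""
    folder, name = dirname(page), basename(page)
    return [
        join(folder, "." + name + ".swp"),  # vim swap file
        join(folder, name + ".bak"),        # manual backup copy
        join(folder, name + "~"),           # backup copy left by some editors
    ]
```

For example, editor_backup_candidates("/admin/login.php") yields /admin/.login.php.swp, /admin/login.php.bak and /admin/login.php~.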

Repair suggestions

Practice good version management, use the version management tool to filter out these types of files, and don't modify or add files directly in the production environment.

Leaked configuration files

Modern web development rarely reinvents the wheel and instead builds on a mature framework. If a penetration tester knows which framework a website is based on,
the paths of important configuration files can be found in the framework's documentation, and for an open-source framework the source code is available too, so the seriousness of a configuration file leak is self-evident.

How to use

Identify the framework by fingerprinting the website, and then manually test whether important configuration files are accessible. For batch testing, you can prepare
common configuration file paths in advance, such as wordpress/wp-config.php, organize them into a dictionary, and then test them in batches with scripts. You can refer to Pigman's dictionary.

Repair suggestions

Modify the default path of the configuration file, and block access to these paths on the server side.

Leaks caused by configuration errors

Windows IIS / Apache directory traversal

The principle of the directory traversal vulnerability is fairly simple: the program does not fully filter directory-jump sequences such as ../ from user input, allowing malicious users to reach directories above the web root and traverse arbitrary files on the server.
Although the web server itself prohibits access outside the web folder, a dynamic page written carelessly, without properly filtering user input, can still allow file or even directory traversal.
Even web servers themselves have had similar vulnerabilities, such as the Apache Tomcat UTF-8 parsing vulnerability. For specific exploitation and bypass techniques, please refer to other articles online; they are not expanded on here for reasons of space.
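
On the application side, a common defence is to resolve the requested path first and only then check that it still lies inside the web root, instead of trying to strip ../ from the input. A minimal Python sketch under that assumption; the function name is made up, and a real application should also rely on its framework's own file-serving helpers:

```python
import os


def resolve_inside(root: str, user_path: str) -> str:
    """Resolve user_path under root, rejecting anything that escapes root."""
    root_real = os.path.realpath(root)
    # Treat the input as relative so an absolute path cannot replace root.
    relative = user_path.lstrip("/\\")
    candidate = os.path.realpath(os.path.join(root_real, relative))
    if os.path.commonpath([root_real, candidate]) != root_real:
        raise ValueError("path escapes the web root: %r" % user_path)
    return candidate
```

Because realpath also follows symbolic links, a link inside the web root that points elsewhere is rejected as well.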

Nginx configuration security

Nginx has so many configuration options that nobody can be familiar with all of them, but that does not mean you can simply copy and paste whatever a Baidu search turns up. It is best to check the function and usage of each option in the official documentation to
avoid many fatal errors. For example, when Nginx serves static files, a single wrong character in the configuration file is enough:

location /static {
    alias /home/web/static/;
}

Because the location prefix /static has no trailing slash while the alias path does, a request to http://example.com/static../ reaches the parent directory of the static folder and can expose sensitive information. The article
on nginx configuration security by "Parting Song" is very well written and is worth careful study by every developer and operations engineer.
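
The effect can be reproduced with a small model of how a prefix location with alias maps a request onto the file system: the matched prefix is cut off the URI and the rest is appended to the alias path. This Python sketch is a simplification for illustration, not nginx's actual code, and settings.py is a hypothetical file name.

```python
import posixpath
from typing import Optional


def alias_target(location: str, alias: str, uri: str) -> Optional[str]:
    """Model a prefix location with alias: replace the matched prefix by alias."""
    if not uri.startswith(location):
        return None  # this location block does not handle the request
    # normpath stands in for the file system resolving ".." components.
    return posixpath.normpath(alias + uri[len(location):])


# Vulnerable: the location has no trailing slash, the alias has one.
leaked = alias_target("/static", "/home/web/static/", "/static../settings.py")

# Safer: with a trailing slash on both, the same request no longer matches.
blocked = alias_target("/static/", "/home/web/static/", "/static../settings.py")
```

In this model, leaked points at /home/web/settings.py, one level above the static directory, while blocked is None because /static../ does not start with /static/.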

postscript

Leaks of sensitive information happen from time to time and usually cause unpredictable harm. This article discussed some examples of file leaks, which are a subset of information leaks.
File leaks are largely caused by carelessness, so the best prevention is to standardize the development and deployment process and minimize the errors introduced by manual operations. To
quote Pigman: "The opponents we face are all masters of information mining and resource integration. They only need to win once, and we have lost for good."

Origin blog.csdn.net/zhangge3663/article/details/108144913