Linux system: building a natural language processing (NLP) environment [installing and deploying an intelligent text classification system]

I. Installing the environment

1. Install the Anaconda scientific computing environment

The Anaconda scientific computing environment bundles Python 3, pip, and scientific computing packages such as pandas and numpy.

Download Anaconda3-5.2.0-Linux-x86_64.sh:
curl -O https://repo.anaconda.com/archive/Anaconda3-5.2.0-Linux-x86_64.sh

[root@ainlp ~]# curl -O https://repo.anaconda.com/archive/Anaconda3-5.2.0-Linux-x86_64.sh
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  621M  100  621M    0     0  1785k      0  0:05:56  0:05:56 --:--:-- 2241k
[root@ainlp ~]# 

Install Anaconda3-5.2.0-Linux-x86_64.sh:
sh Anaconda3-5.2.0-Linux-x86_64.sh

[root@ainlp ~]# sh Anaconda3-5.2.0-Linux-x86_64.sh

Configure ~/.bashrc:

[root@centos608 ~]# ll -la

Add one line (the path must match the directory the installer actually used, /root/anaconda3 in this setup):

export PATH=/root/anaconda3/bin:$PATH

The modified .bashrc file:

# .bashrc

# User specific aliases and functions

alias rm='rm -i'
alias cp='cp -i'
alias mv='mv -i'

# Source global definitions
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi

# added by Anaconda3 installer
export PATH="/root/anaconda3/bin:$PATH"
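
To confirm that the PATH change took effect, it helps to see which python executable is found first. The following is a minimal sketch using only the standard library; the helper name first_on_path is hypothetical, and /root/anaconda3/bin is simply the directory from the .bashrc above.

```python
import os
import shutil


def first_on_path(command, path=None):
    """Return the full path of the first executable named `command`.

    `path` is a PATH-style string; when None, the current PATH is used.
    Returns None if the command is not found.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    return shutil.which(command, path=path)


if __name__ == "__main__":
    # After `source ~/.bashrc`, this is expected to point into
    # /root/anaconda3/bin if the export line was picked up.
    print(first_on_path("python"))
```

If the printed path is still the system interpreter, the export line was probably not loaded in the current shell.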

2. Install the required components: supervisor and nginx

yum install supervisor -y
yum install nginx -y

3. Install pip

wget https://bootstrap.pypa.io/2.7/get-pip.py
python get-pip.py


4. Install the build tools

yum install -y gcc* pcre-devel openssl-devel

...
Dependency Updated:
  e2fsprogs.x86_64 0:1.42.9-19.el7        e2fsprogs-libs.x86_64 0:1.42.9-19.el7    glibc.x86_64 0:2.17-323.el7_9    glibc-common.x86_64 0:2.17-323.el7_9    krb5-libs.x86_64 0:1.15.1-50.el7    krb5-workstation.x86_64 0:1.15.1-50.el7   
  libcom_err.x86_64 0:1.42.9-19.el7       libgcc.x86_64 0:4.8.5-44.el7             libgomp.x86_64 0:4.8.5-44.el7    libkadm5.x86_64 0:1.15.1-50.el7         libselinux.x86_64 0:2.5-15.el7      libselinux-python.x86_64 0:2.5-15.el7     
  libselinux-utils.x86_64 0:2.5-15.el7    libsepol.x86_64 0:2.5-10.el7             libss.x86_64 0:1.42.9-19.el7     libstdc++.x86_64 0:4.8.5-44.el7         openssl.x86_64 1:1.0.2k-21.el7_9    openssl-libs.x86_64 1:1.0.2k-21.el7_9     
  zlib.x86_64 0:1.2.7-19.el7_9           

Complete!
[root@ainlp django-uwsgi]# 

5. Install the Python dependencies

yum install -y python-devel

[root@ainlp django-uwsgi]# yum install -y python-devel
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
 * base: mirrors.aliyun.com
 * epel: mirror.sjtu.edu.cn
 * extras: mirrors.aliyun.com
 * updates: mirrors.aliyun.com
Package python-devel-2.7.5-90.el7.x86_64 already installed and latest version
Nothing to do
[root@ainlp django-uwsgi]# 

6. Install the Python packages the project requires (uwsgi, tensorflow, keras, django, and so on). We use requirements.txt to install them all together.

cd /data/django-uwsgi/
pip install -r requirements.txt

The requirements.txt file contains:

## The following requirements were added by pip freeze:
neo4j-driver
pandas>=0.20.3
numpy>=1.13.1
jieba>=0.39
Django>=1.11.7
djangorestframework>=3.7.3
django-filter>=1.1.0
flower>=0.9.2
uwsgi>=2.0.15
requests>=2.18.4
django-cors-headers
tensorflow==1.14.0
keras==2.2.4
celery>=3.1.25 
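
In the list above, only tensorflow and keras are pinned with ==; the other entries are lower bounds or unconstrained, so a fresh install may pull newer releases than the ones originally used. The sketch below shows one way to tell the two kinds apart. The regular expression and the names parse_requirements and REQ_SAMPLE are illustrative assumptions, and the parser handles only the simple line forms seen in this file, not the full requirements syntax.

```python
import re

# Matches lines such as "pandas>=0.20.3", "tensorflow==1.14.0" or "neo4j-driver".
REQ_RE = re.compile(
    r"^([A-Za-z0-9_.\-]+)(?:\s*(==|>=|<=|~=|!=|>|<)\s*([^\s#]+))?"
)


def parse_requirements(text):
    """Return a list of (name, operator, version) tuples.

    Blank lines and comment lines starting with '#' are skipped.
    Operator and version are None for unconstrained packages.
    """
    parsed = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        match = REQ_RE.match(line)
        if match:
            parsed.append(match.groups())
    return parsed


# A few lines copied from the requirements.txt above.
REQ_SAMPLE = """\
## The following requirements were added by pip freeze:
neo4j-driver
Django>=1.11.7
tensorflow==1.14.0
keras==2.2.4
"""

if __name__ == "__main__":
    for name, op, version in parse_requirements(REQ_SAMPLE):
        kind = "pinned" if op == "==" else "floating"
        print(name, op, version, kind)
```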

Check whether Django is installed, and which version. If the following command prints a version number, Django is installed and that is the version in use; if you get the error "No module named django", it has not been installed yet.

$ python -m django --version
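
The same check can be done from Python for any importable module. This is a minimal sketch, assuming the module exposes a __version__ attribute (Django does); the helper name installed_version is hypothetical.

```python
import importlib


def installed_version(module_name):
    """Return the module's __version__ string, "unknown" if the module
    has no such attribute, or None if the module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return str(getattr(module, "__version__", "unknown"))


if __name__ == "__main__":
    version = installed_version("django")
    print("django", version if version is not None else "not installed")
```

Note that the argument is the import name, which can differ from the package name in requirements.txt.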

Create your own Django project. Open a command line, cd to a directory where you want to keep your code, such as cd /User/tester/myonesite, and run the command below. It creates a project directory named mysite in the current directory.

$ django-admin startproject mysite

To verify that the project was created, first change to the project directory with cd /User/tester/myonesite/mysite, then run:

$ python manage.py runserver

Output like the following means the project was created successfully:

Performing system checks...

System check identified no issues (0 silenced).

You have unapplied migrations; your app may not work properly until they are applied.
Run 'python manage.py migrate' to apply them.

August 08, 2018 - 15:50:53
Django version 2.0, using settings 'mysite.settings'
Starting development server at http://127.0.0.1:8000/
Quit the server with CONTROL-C.

By default, the runserver command makes the development server listen on port 8000 of the loopback address (127.0.0.1). To use a different port, such as 8082, pass it as an argument:

$ python manage.py runserver 8082
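
To check from another terminal whether the development server is actually listening, a plain TCP connection attempt is enough. The following is a minimal sketch; the helper name port_is_open is hypothetical, and the ports are the ones mentioned above.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 8000 is runserver's default; 8082 is the port used in the command above.
    for port in (8000, 8082):
        state = "listening" if port_is_open("127.0.0.1", port) else "closed"
        print(port, state)
```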

Create an application. In Django, each application is a Python package that follows the same conventions. Django ships with a tool that generates the basic directory structure of an application, so you can focus on writing code instead of creating directories. Use the following command to create an application named login in mysite:

$ python manage.py startapp login

Generate migration files for the model changes:

$ python manage.py makemigrations 

Apply the migrations to the application's database:

$ python manage.py migrate 

Create an administrator account for the Django admin site. Enter the administrator name you want to create, then the email address, the password, and the password a second time to confirm it.

python manage.py createsuperuser

The message "Superuser created successfully." indicates that the account was created.

Change the administrator password:

$ python manage.py changepassword admin

7. Install the Neo4j graph database

To install Neo4j on CentOS, you must add the Yum repository manually:

cd /tmp
wget http://debian.neo4j.org/neotechnology.gpg.key
sudo rpm --import neotechnology.gpg.key
  • cd /tmp changes to the system's tmp directory;
  • wget downloads the repository signing key, neotechnology.gpg.key, to the current directory;
  • sudo rpm --import neotechnology.gpg.key imports that key into the system.

Next, use a text editor to create /etc/yum.repos.d/neo4j.repo with the following content:

[neo4j] 
name=Neo4j RPM Repository
baseurl=http://yum.neo4j.org/stable
enabled=1
gpgcheck=1
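
A .repo file is INI-style, so a typo in it can be caught before running yum. The sketch below parses the text with configparser and reports missing keys; the function name check_repo and the list of expected keys are assumptions made for illustration, not a yum requirement list.

```python
import configparser

# Same content as /etc/yum.repos.d/neo4j.repo above.
REPO_TEXT = """\
[neo4j]
name=Neo4j RPM Repository
baseurl=http://yum.neo4j.org/stable
enabled=1
gpgcheck=1
"""


def check_repo(text, section="neo4j"):
    """Parse yum .repo text and return a list of problems found."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section(section):
        return ["missing section [%s]" % section]
    problems = []
    for key in ("name", "baseurl", "enabled", "gpgcheck"):
        if not parser.has_option(section, key):
            problems.append("missing key: " + key)
    return problems


if __name__ == "__main__":
    print(check_repo(REPO_TEXT) or "repo file looks complete")
```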

Finally, install Neo4j with yum:

yum install neo4j-3.3.5

Neo4j is now installed on CentOS. After installation, its files are located as follows:

  1. The Neo4j installation directory is /usr/share/neo4j
  2. The directory holding the Neo4j configuration file is /etc/neo4j
  3. The default directory for Neo4j database files is /var/lib/neo4j

Change to the directory /usr/share/neo4j/bin and run the neo4j start command to start the Neo4j database.

Copy your own configuration file into place:

# Use our own configuration file
cp /data/django-uwsgi/util/neo4j.conf /etc/neo4j/neo4j.conf

The neo4j.conf configuration file is:

#*****************************************************************
# Neo4j configuration
#
# For more details and a complete list of settings, please see
# https://neo4j.com/docs/operations-manual/current/reference/configuration-settings/
#*****************************************************************

# The name of the database to mount
#dbms.active_database=graph.db

# Paths of directories in the installation.
dbms.directories.data=/var/neo4j/db
dbms.directories.plugins=/var/lib/neo4j/plugins
dbms.directories.certificates=/var/lib/neo4j/certificates
dbms.directories.logs=/var/log/neo4j/
dbms.directories.lib=/usr/share/neo4j/lib
dbms.directories.run=/var/run/neo4j

# This setting constrains all `LOAD CSV` import files to be under the `import` directory. Remove or comment it out to
# allow files to be loaded from anywhere in the filesystem; this introduces possible security problems. See the
# `LOAD CSV` section of the manual for details.
dbms.directories.import=/var/neo4j/import

# Whether requests to Neo4j are authenticated.
# To disable authentication, uncomment this line
#dbms.security.auth_enabled=false

# Enable this to be able to upgrade a store from an older version.
#dbms.allow_upgrade=true

# Java Heap Size: by default the Java heap size is dynamically
# calculated based on available system resources.
# Uncomment these lines to set specific initial and maximum
# heap size.
dbms.memory.heap.initial_size=512m
#dbms.memory.heap.max_size=10g

# The amount of memory to use for mapping the store files, in bytes (or
# kilobytes with the 'k' suffix, megabytes with 'm' and gigabytes with 'g').
# If Neo4j is running on a dedicated server, then it is generally recommended
# to leave about 2-4 gigabytes for the operating system, give the JVM enough
# heap to hold all your transaction state and query context, and then leave the
# rest for the page cache.
# The default page cache memory assumes the machine is dedicated to running
# Neo4j, and is heuristically set to 50% of RAM minus the max Java heap size.
#dbms.memory.pagecache.size=10g

#*****************************************************************
# Network connector configuration
#*****************************************************************

# With default configuration Neo4j only accepts local connections.
# To accept non-local connections, uncomment this line:
dbms.connectors.default_listen_address=0.0.0.0

# You can also choose a specific network interface, and configure a non-default
# port for each connector, by setting their individual listen_address.

# The address at which this server can be reached by its clients. This may be the server's IP address or DNS name, or
# it may be the address of a reverse proxy which sits in front of the server. This setting may be overridden for
# individual connectors below.
dbms.connectors.default_advertised_address=0.0.0.0

# You can also choose a specific advertised hostname or IP address, and
# configure an advertised port for each connector, by setting their
# individual advertised_address.

# Bolt connector
dbms.connector.bolt.enabled=true
dbms.connector.bolt.tls_level=OPTIONAL
dbms.connector.bolt.listen_address=0.0.0.0:7687

# HTTP Connector. There must be exactly one HTTP connector.
dbms.connector.http.enabled=true
dbms.connector.http.listen_address=0.0.0.0:7474

# HTTPS Connector. There can be zero or one HTTPS connectors.
#dbms.connector.https.enabled=true
#dbms.connector.https.listen_address=:7473

# Number of Neo4j worker threads.
#dbms.threads.worker_count=

#*****************************************************************
# SSL system configuration
#*****************************************************************

# Names of the SSL policies to be used for the respective components.

# The legacy policy is a special policy which is not defined in
# the policy configuration section, but rather derives from
# dbms.directories.certificates and associated files
# (by default: neo4j.key and neo4j.cert). Its use will be deprecated.

# The policies to be used for connectors.
#
# N.B: Note that a connector must be configured to support/require
#      SSL/TLS for the policy to actually be utilized.
#
# see: dbms.connector.*.tls_level

#bolt.ssl_policy=legacy
#https.ssl_policy=legacy

#*****************************************************************
# SSL policy configuration
#*****************************************************************

# Each policy is configured under a separate namespace, e.g.
#    dbms.ssl.policy.<policyname>.*
#
# The example settings below are for a new policy named 'default'.

# The base directory for cryptographic objects. Each policy will by
# default look for its associated objects (keys, certificates, ...)
# under the base directory.
#
# Every such setting can be overridden using a full path to
# the respective object, but every policy will by default look
# for cryptographic objects in its base location.
#
# Mandatory setting

#dbms.ssl.policy.default.base_directory=certificates/default

# Allows the generation of a fresh private key and a self-signed
# certificate if none are found in the expected locations. It is
# recommended to turn this off again after keys have been generated.
#
# Keys should in general be generated and distributed offline
# by a trusted certificate authority (CA) and not by utilizing
# this mode.

#dbms.ssl.policy.default.allow_key_generation=false

# Enabling this makes it so that this policy ignores the contents
# of the trusted_dir and simply resorts to trusting everything.
#
# Use of this mode is discouraged. It would offer encryption but no security.

#dbms.ssl.policy.default.trust_all=false

# The private key for the default SSL policy. By default a file
# named private.key is expected under the base directory of the policy.
# It is mandatory that a key can be found or generated.

#dbms.ssl.policy.default.private_key=

# The public certificate for the default SSL policy. By default a file
# named public.crt is expected under the base directory of the policy.
# It is mandatory that a certificate can be found or generated.

#dbms.ssl.policy.default.public_certificate=

# The certificates of trusted parties. By default a directory named
# 'trusted' is expected under the base directory of the policy. It is
# mandatory to create the directory so that it exists, because it cannot
# be auto-created (for security purposes).
#
# To enforce client authentication client_auth must be set to 'require'!

#dbms.ssl.policy.default.trusted_dir=

# Client authentication setting. Values: none, optional, require
# The default is to require client authentication.
#
# Servers are always authenticated unless explicitly overridden
# using the trust_all setting. In a mutual authentication setup this
# should be kept at the default of require and trusted certificates
# must be installed in the trusted_dir.

#dbms.ssl.policy.default.client_auth=require

# A comma-separated list of allowed TLS versions.
# By default only TLSv1.2 is allowed.

#dbms.ssl.policy.default.tls_versions=

# A comma-separated list of allowed ciphers.
# The default ciphers are the defaults of the JVM platform.

#dbms.ssl.policy.default.ciphers=

#*****************************************************************
# Logging configuration
#*****************************************************************

# To enable HTTP logging, uncomment this line
#dbms.logs.http.enabled=true

# Number of HTTP logs to keep.
#dbms.logs.http.rotation.keep_number=5

# Size of each HTTP log that is kept.
#dbms.logs.http.rotation.size=20m

# To enable GC Logging, uncomment this line
#dbms.logs.gc.enabled=true

# GC Logging Options
# see http://docs.oracle.com/cd/E19957-01/819-0084-10/pt_tuningjava.html#wp57013 for more information.
#dbms.logs.gc.options=-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintPromotionFailure -XX:+PrintTenuringDistribution

# Number of GC logs to keep.
#dbms.logs.gc.rotation.keep_number=5

# Size of each GC log that is kept.
#dbms.logs.gc.rotation.size=20m

# Size threshold for rotation of the debug log. If set to zero then no rotation will occur. Accepts a binary suffix "k",
# "m" or "g".
#dbms.logs.debug.rotation.size=20m

# Maximum number of history files for the internal log.
#dbms.logs.debug.rotation.keep_number=7

#*****************************************************************
# Miscellaneous configuration
#*****************************************************************

# Enable this to specify a parser other than the default one.
#cypher.default_language_version=3.0

# Determines if Cypher will allow using file URLs when loading data using
# `LOAD CSV`. Setting this value to `false` will cause Neo4j to fail `LOAD CSV`
# clauses that load data from the file system.
#dbms.security.allow_csv_import_from_file_urls=true


# Value of the Access-Control-Allow-Origin header sent over any HTTP or HTTPS
# connector. This defaults to '*', which allows broadest compatibility. Note
# that any URI provided here limits HTTP/HTTPS access to that URI only.
#dbms.security.http_access_control_allow_origin=*

# Value of the HTTP Strict-Transport-Security (HSTS) response header. This header
# tells browsers that a webpage should only be accessed using HTTPS instead of HTTP.
# It is attached to every HTTPS response. Setting is not set by default so
# 'Strict-Transport-Security' header is not sent. Value is expected to contain
# directives like 'max-age', 'includeSubDomains' and 'preload'.
#dbms.security.http_strict_transport_security=

# Retention policy for transaction logs needed to perform recovery and backups.
dbms.tx_log.rotation.retention_policy=1 days

# Enable a remote shell server which Neo4j Shell clients can log in to.
#dbms.shell.enabled=true
# The network interface IP the shell will listen on (use 0.0.0.0 for all interfaces).
#dbms.shell.host=127.0.0.1
# The port the shell will listen on, default is 1337.
#dbms.shell.port=1337

# Only allow read operations from this Neo4j instance. This mode still requires
# write access to the directory for lock purposes.
#dbms.read_only=false

# Comma separated list of JAX-RS packages containing JAX-RS resources, one
# package name for each mountpoint. The listed package names will be loaded
# under the mountpoints specified. Uncomment this line to mount the
# org.neo4j.examples.server.unmanaged.HelloWorldResource.java from
# neo4j-server-examples under /examples/unmanaged, resulting in a final URL of
# http://localhost:7474/examples/unmanaged/helloworld/{nodeId}
#dbms.unmanaged_extension_classes=org.neo4j.examples.server.unmanaged=/examples/unmanaged

#********************************************************************
# JVM Parameters
#********************************************************************

# G1GC generally strikes a good balance between throughput and tail
# latency, without too much tuning.
dbms.jvm.additional=-XX:+UseG1GC

# Have common exceptions keep producing stack traces, so they can be
# debugged regardless of how often logs are rotated.
dbms.jvm.additional=-XX:-OmitStackTraceInFastThrow

# Make sure that `initmemory` is not only allocated, but committed to
# the process, before starting the database. This reduces memory
# fragmentation, increasing the effectiveness of transparent huge
# pages. It also reduces the possibility of seeing performance drop
# due to heap-growing GC events, where a decrease in available page
# cache leads to an increase in mean IO response time.
# Try reducing the heap memory, if this flag degrades performance.
dbms.jvm.additional=-XX:+AlwaysPreTouch

# Trust that non-static final fields are really final.
# This allows more optimizations and improves overall performance.
# NOTE: Disable this if you use embedded mode, or have extensions or dependencies that may use reflection or
# serialization to change the value of final fields!
dbms.jvm.additional=-XX:+UnlockExperimentalVMOptions
dbms.jvm.additional=-XX:+TrustFinalNonStaticFields

# Disable explicit garbage collection, which is occasionally invoked by the JDK itself.
dbms.jvm.additional=-XX:+DisableExplicitGC

# Remote JMX monitoring, uncomment and adjust the following lines as needed. Absolute paths to jmx.access and
# jmx.password files are required.
# Also make sure to update the jmx.access and jmx.password files with appropriate permission roles and passwords,
# the shipped configuration contains only a read only role called 'monitor' with password 'Neo4j'.
# For more details, see: http://download.oracle.com/javase/8/docs/technotes/guides/management/agent.html
# On Unix based systems the jmx.password file needs to be owned by the user that will run the server,
# and have permissions set to 0600.
# For details on setting these file permissions on Windows see:
#     http://docs.oracle.com/javase/8/docs/technotes/guides/management/security-windows.html
#dbms.jvm.additional=-Dcom.sun.management.jmxremote.port=3637
#dbms.jvm.additional=-Dcom.sun.management.jmxremote.authenticate=true
#dbms.jvm.additional=-Dcom.sun.management.jmxremote.ssl=false
#dbms.jvm.additional=-Dcom.sun.management.jmxremote.password.file=/absolute/path/to/conf/jmx.password
#dbms.jvm.additional=-Dcom.sun.management.jmxremote.access.file=/absolute/path/to/conf/jmx.access

# Some systems cannot discover host name automatically, and need this line configured:
#dbms.jvm.additional=-Djava.rmi.server.hostname=$THE_NEO4J_SERVER_HOSTNAME

# Expand Diffie Hellman (DH) key size from default 1024 to 2048 for DH-RSA cipher suites used in server TLS handshakes.
# This is to protect the server from any potential passive eavesdropping.
dbms.jvm.additional=-Djdk.tls.ephemeralDHKeySize=2048

# This mitigates a DDoS vector.
dbms.jvm.additional=-Djdk.tls.rejectClientInitiatedRenegotiation=true

#********************************************************************
# Wrapper Windows NT/2000/XP Service Properties
#********************************************************************
# WARNING - Do not modify any of these properties when an application
#  using this configuration file has been installed as a service.
#  Please uninstall the service before modifying this section.  The
#  service can then be reinstalled.

# Name of the service
dbms.windows_service_name=neo4j

#********************************************************************
# Other Neo4j system properties
#********************************************************************
dbms.jvm.additional=-Dunsupported.dbms.udc.source=rpm
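
Most of the file above is commented out, so it can be hard to see which settings are actually active. The following is a minimal sketch that lists them; the function name parse_neo4j_conf is hypothetical, and the sample holds only a few lines copied from the file. Values are collected in lists because dbms.jvm.additional appears several times.

```python
from collections import defaultdict


def parse_neo4j_conf(text):
    """Return {key: [values...]} for every active (uncommented) setting.

    Values are kept in a list because keys such as dbms.jvm.additional
    appear more than once in the file.
    """
    settings = defaultdict(list)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so JVM flags that contain '='
        # stay intact in the value.
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()].append(value.strip())
    return dict(settings)


CONF_SAMPLE = """\
# Bolt connector
dbms.connector.bolt.enabled=true
dbms.connector.bolt.listen_address=0.0.0.0:7687
#dbms.connector.https.enabled=true
dbms.jvm.additional=-XX:+UseG1GC
dbms.jvm.additional=-Djdk.tls.ephemeralDHKeySize=2048
"""

if __name__ == "__main__":
    for key, values in parse_neo4j_conf(CONF_SAMPLE).items():
        print(key, values)
```

To inspect the real file, read /etc/neo4j/neo4j.conf and pass its text to the function instead of the sample.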

Start the graph database and check its status:
neo4j start
neo4j status

[root@ainlp ~]# cd /usr/share/neo4j/bin
[root@ainlp bin]# neo4j start
Active database: graph.db
Directories in use:
  home:         /var/lib/neo4j
  config:       /etc/neo4j
  logs:         /var/log/neo4j/
  plugins:      /var/lib/neo4j/plugins
  import:       /var/neo4j/import
  data:         /var/neo4j/db
  certificates: /var/lib/neo4j/certificates
  run:          /var/run/neo4j
Starting Neo4j.
WARNING: Max 1024 open files allowed, minimum of 40000 recommended. See the Neo4j manual.
Started neo4j (pid 2701). It is available at http://0.0.0.0:7474/
There may be a short delay until the server is ready.
See /var/log/neo4j//neo4j.log for current status.
[root@ainlp ~]# neo4j status
Neo4j is running at pid 2701
[root@ainlp ~]# 
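
The startup log warns that there may be a short delay until the server is ready, so a script that depends on Neo4j may want to wait for the HTTP connector first. This is a minimal sketch using only the standard library; the function name wait_for_http and the timeout values are assumptions, and port 7474 comes from the configuration above.

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url, timeout=30.0, interval=1.0):
    """Poll `url` until the server answers or `timeout` seconds pass.

    Any HTTP response, including an error status, counts as "up";
    connection failures are retried. Returns True or False.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            return True  # the server replied, just not with a 2xx status
        except (urllib.error.URLError, OSError):
            pass  # not accepting connections yet
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # 7474 is the HTTP connector port from neo4j.conf.
    print(wait_for_http("http://127.0.0.1:7474/", timeout=5.0))
```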

II. Building the back-end services

1. Start the graph database and check its status.

cd /data/django-uwsgi

Start the graph database:
neo4j start

Check the status:
neo4j status

[root@ainlp ~]# cd /data/django-uwsgi
[root@ainlp django-uwsgi]# neo4j start
Active database: graph.db
Directories in use:
  home:         /var/lib/neo4j
  config:       /etc/neo4j
  logs:         /var/log/neo4j/
  plugins:      /var/lib/neo4j/plugins
  import:       /var/neo4j/import
  data:         /var/neo4j/db
  certificates: /var/lib/neo4j/certificates
  run:          /var/run/neo4j
Starting Neo4j.
WARNING: Max 1024 open files allowed, minimum of 40000 recommended. See the Neo4j manual.
/usr/share/neo4j/bin/neo4j: line 410: /var/run/neo4j/neo4j.pid: No such file or directory
[root@ainlp django-uwsgi]# neo4j status
Neo4j is not running
[root@ainlp django-uwsgi]# 
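
In the log above, the start fails at the line that writes /var/run/neo4j/neo4j.pid, which suggests that the run directory configured as dbms.directories.run does not exist. The source does not say why; one possible cause, stated here only as an assumption, is that /var/run is a temporary filesystem on CentOS 7 and is emptied on reboot. The sketch below recreates the directory before retrying neo4j start. The helper name ensure_run_dir is hypothetical, and directory ownership for the account that runs Neo4j is not handled here.

```python
import os


def ensure_run_dir(path="/var/run/neo4j"):
    """Create the run directory if it is missing.

    Returns True if the directory was created, False if it already existed.
    """
    if os.path.isdir(path):
        return False
    os.makedirs(path, mode=0o755)
    return True


if __name__ == "__main__":
    # Needs root on a real server; afterwards, retry `neo4j start`.
    try:
        print("created" if ensure_run_dir() else "already present")
    except OSError as exc:
        print("cannot create run directory:", exc)
```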

2. Use supervisor to start the main service and check the service status.

  • Use supervisord to start the main service; -c tells it to read a custom configuration file
  • supervisord.conf is a configuration file in the project's root directory that defines how the django and nginx processes are monitored and kept alive
cd /data/django-uwsgi
[root@ainlp django-uwsgi]# supervisord -c supervisord.conf
[root@ainlp django-uwsgi]# 

Check the status of all monitored and supervised processes:

[root@ainlp django-uwsgi]# supervisorctl status all
main_server                      BACKOFF   Exited too quickly (process log may have details)
nginx                            RUNNING   pid 3096, uptime 0:00:06
[root@ainlp django-uwsgi]# 
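
In this output nginx is RUNNING, while main_server is in BACKOFF, the state supervisor reports when a process exits too quickly after being started. The following is a minimal sketch for picking out such programs from the status text; the function names are hypothetical, and the sample is the output shown above.

```python
STATUS_SAMPLE = """\
main_server                      BACKOFF   Exited too quickly (process log may have details)
nginx                            RUNNING   pid 3096, uptime 0:00:06
"""


def parse_status(text):
    """Map each program name to its (state, detail) from status output."""
    result = {}
    for line in text.splitlines():
        # Columns are separated by runs of spaces: name, state, free text.
        parts = line.split(None, 2)
        if len(parts) < 2:
            continue
        name, state = parts[0], parts[1]
        detail = parts[2] if len(parts) > 2 else ""
        result[name] = (state, detail)
    return result


def not_running(text):
    """Return the names of programs whose state is not RUNNING."""
    return [name for name, (state, _) in parse_status(text).items()
            if state != "RUNNING"]


if __name__ == "__main__":
    print(not_running(STATUS_SAMPLE))
```

For any program listed, the process log mentioned in the status detail is the next place to look.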



Origin blog.csdn.net/u013250861/article/details/114296963