ElasticSearch (3): JD hands-on project

1. Project results

Crawl each product's title, price, and image (img) from the web page, store them in es, then search them and render the results with Vue.

2. Create a project

1. Create a SpringBoot project

In pom.xml, set the es version property so that it matches the version of the es server you are running:

<elasticsearch.version>7.6.1</elasticsearch.version>

Add in the application.properties configuration file:

server.port=8888
spring.thymeleaf.cache=false

Download the front-end files and put them under the project's resources directory.

My Gitee repository: https://gitee.com/he-ze__-an/elastic-search

Add dependencies:

<!-- jsoup: parses web pages -->
<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.13.1</version>
</dependency>
<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>fastjson</artifactId>
    <version>2.0.19</version>
</dependency>

Write the configuration: create a config package and add ElasticSearchClientConfig.java, which connects to our es instance.

package com.heze.config;

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// 1. Build the client object
// 2. Register it as a Spring bean for later use
@Configuration
public class ElasticSearchClientConfig {

    @Bean
    public RestHighLevelClient restHighLevelClient() {
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(
                        new HttpHost("127.0.0.1", 9200, "http")
                )
        );
        return client;
    }

}

3. Crawler

Create a utils package and write HtmlParseUtil.java, which fetches three fields for each product from the website: title, img, and price.

Create a pojo package and write Content.java, which holds those three properties.
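
Content.java itself is not listed in this post. A minimal sketch of what it could look like, assuming plain getters and setters (the real project may well generate them with Lombok instead). The field names follow the setters that HtmlParseUtil calls; the three-argument constructor and toString are additions for convenience.

```java
// In the project this class lives in package com.heze.pojo; the package
// line is left out here so the sketch compiles on its own.
public class Content {

    private String title; // product title text
    private String img;   // product image URL
    private String price; // price text exactly as scraped from the page

    // No-argument constructor, used by HtmlParseUtil
    public Content() {
    }

    // Convenience constructor (an addition, not required by the post's code)
    public Content(String title, String img, String price) {
        this.title = title;
        this.img = img;
        this.price = price;
    }

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public String getImg() {
        return img;
    }

    public void setImg(String img) {
        this.img = img;
    }

    public String getPrice() {
        return price;
    }

    public void setPrice(String price) {
        this.price = price;
    }

    @Override
    public String toString() {
        return "Content{title='" + title + "', img='" + img + "', price='" + price + "'}";
    }
}
```

The getters matter: fastjson reads them when the object is serialized to JSON before it is written to es.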

package com.heze.utils;

import com.heze.pojo.Content;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import org.springframework.stereotype.Component;

import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

/**
 * @author 何泽
 * @version 1.0
 */
@Component
public class HtmlParseUtil {

    public List<Content> parseJD(String keywords) throws IOException {
        // Request URL, e.g. https://search.jd.com/Search?keyword=java
        String url = "https://search.jd.com/Search?keyword=" + keywords;

        // Parse the page (the Document returned by Jsoup works like the browser's Document object)
        Document document = Jsoup.parse(new URL(url), 30000);
        // The DOM methods you would use in JavaScript are available here too
        Element elementById = document.getElementById("J_goodsList");

        ArrayList<Content> goodsList = new ArrayList<>();
        // Guard: the product list container is absent if the page layout changed
        // or the request was redirected; return an empty list instead of failing
        if (elementById == null) {
            return goodsList;
        }
        // Get all li elements
        Elements elements = elementById.getElementsByTag("li");

        // Read the fields out of each element
        for (Element el : elements) {
            String img = el.getElementsByTag("img").eq(0).attr("data-lazy-img");
            String price = el.getElementsByClass("p-price").eq(0).text();
            String title = el.getElementsByClass("p-name").eq(0).text();

            Content content = new Content();
            content.setImg(img);
            content.setPrice(price);
            content.setTitle(title);
            goodsList.add(content);
        }
        return goodsList;
    }
}
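
The crawler appends the keyword to the query string as-is, which only works for plain ASCII words. A keyword that contains spaces, Chinese characters, or symbols such as + should be percent-encoded first. A small sketch using only the JDK; the class name SearchUrls and the method forKeyword are hypothetical helpers, not part of the original project:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds the JD search URL with an encoded keyword.
class SearchUrls {

    static final String BASE = "https://search.jd.com/Search?keyword=";

    static String forKeyword(String keywords) {
        // URLEncoder produces form-style encoding, which is what a query
        // string value expects (spaces become '+', other bytes become %XX)
        return BASE + URLEncoder.encode(keywords, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(forKeyword("java \u7f16\u7a0b"));
    }
}
```

In parseJD, the URL could then be built with SearchUrls.forKeyword(keywords) in place of the string concatenation. The Charset overload of URLEncoder.encode needs Java 10 or later.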

4. Store in es and search it

Create a service package and add ContentService.java; it uses the injected restHighLevelClient to implement search with highlighting.

package com.heze.service;

import com.alibaba.fastjson.JSON;
import com.heze.pojo.Content;
import com.heze.utils.HtmlParseUtil;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.text.Text;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.common.xcontent.XContentType;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.TermQueryBuilder;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightField;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

import javax.naming.directory.SearchResult;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.TimeUnit;

@Service
public class ContentService {

    @Autowired
    private RestHighLevelClient restHighLevelClient;

    // 1. Parse the data and put it into the es index
    public Boolean parseContent(String keywords) throws IOException {
        List<Content> contents = new HtmlParseUtil().parseJD(keywords);
        // A bulk request with no actions is rejected by the client, so stop early
        if (contents.isEmpty()) {
            return false;
        }
        // Put the crawled data into es
        BulkRequest bulkRequest = new BulkRequest();
        bulkRequest.timeout("2m");

        for (int i = 0; i < contents.size(); i++) {
            bulkRequest.add(new IndexRequest("jd_goods")
                    .source(JSON.toJSONString(contents.get(i)), XContentType.JSON)
            );
        }
        BulkResponse bulk = restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT);
        return !bulk.hasFailures();
    }
    
    // 2. Read the data back to implement search
    public List<Map<String, Object>> searchPage(String keyword, int pageNo, int pageSize) throws IOException {
        if (pageNo <= 1) {
            pageNo = 1;
        }
        // Conditional search
        SearchRequest searchRequest = new SearchRequest("jd_goods");
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

        // Pagination: from is a zero-based document offset, not a page number
        searchSourceBuilder.from((pageNo - 1) * pageSize);
        searchSourceBuilder.size(pageSize);

        // Highlighting
        HighlightBuilder highlightBuilder = new HighlightBuilder();
        highlightBuilder.field("title");
        highlightBuilder.requireFieldMatch(false); // highlight even if the query matched another field
        highlightBuilder.preTags("<span style='color:red'>");
        highlightBuilder.postTags("</span>");
        searchSourceBuilder.highlighter(highlightBuilder);

        // Exact (term-level) match
        TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("title", keyword);
        searchSourceBuilder.query(termQueryBuilder);
        searchSourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));

        // Run the search
        searchRequest.source(searchSourceBuilder);
        SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);

        // Parse the results
        ArrayList<Map<String, Object>> list = new ArrayList<>();
        for (SearchHit documentFields : searchResponse.getHits().getHits()) {
            Map<String, HighlightField> highlightFields = documentFields.getHighlightFields();
            HighlightField title = highlightFields.get("title");
            Map<String, Object> sourceAsMap = documentFields.getSourceAsMap();
            // Parse the highlighted field
            if (title != null) {
                Text[] fragments = title.fragments();
                String name = "";
                for (Text text : fragments) {
                    name += text;
                }
                sourceAsMap.put("title", name); // replace the original title with the highlighted one
            }

            list.add(sourceAsMap);
        }
        return list;
    }

}
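
One detail of the pagination is easy to get wrong: the from value of an es search is a zero-based document offset, while pageNo in searchPage is a one-based page number. The conversion can be sketched in isolation as below; the class name Paging and the method offset are hypothetical, introduced only for illustration:

```java
// Hypothetical helper: converts a 1-based page number to an es "from" offset.
class Paging {

    static int offset(int pageNo, int pageSize) {
        // Treat anything below page 1 as page 1, as searchPage does
        int page = Math.max(pageNo, 1);
        return (page - 1) * pageSize;
    }

    public static void main(String[] args) {
        System.out.println(offset(1, 10)); // first page starts at document 0
        System.out.println(offset(3, 10)); // third page starts at document 20
    }
}
```

With such a helper, the pagination lines in searchPage would read searchSourceBuilder.from(Paging.offset(pageNo, pageSize)) followed by searchSourceBuilder.size(pageSize).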

5. Render with Vue

First expose the search to the front end. Create a controller package and write ContentController.java, which provides the parse and search endpoints:

package com.heze.controller;

import com.heze.service.ContentService;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

import java.io.IOException;
import java.util.List;
import java.util.Map;

@RestController
public class ContentController {

    @Autowired
    private ContentService contentService;

    @GetMapping("/parse/{keywords}")
    public Boolean parse(@PathVariable("keywords") String keywords) throws IOException {
        return contentService.parseContent(keywords);
    }

    @GetMapping("/search/{keyword}/{pageNo}/{pageSize}")
    public List<Map<String, Object>> search(@PathVariable("keyword") String keyword,
                                            @PathVariable("pageNo") int pageNo,
                                            @PathVariable("pageSize") int pageSize) throws IOException {
        return contentService.searchPage(keyword, pageNo, pageSize);
    }

}
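
Both endpoints take their inputs as path variables, so a caller has to percent-encode the keyword before placing it in the path. The sketch below builds the request URIs with the JDK only (Java 11 or later for java.net.http). The host localhost is an assumption, the port 8888 comes from application.properties above, and the class EndpointRequests is a hypothetical helper, not part of the project:

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds requests for the /parse and /search endpoints.
class EndpointRequests {

    // Assumed base address; the port matches server.port=8888
    static final String BASE = "http://localhost:8888";

    // Encode a value for use as a single path segment
    static String segment(String raw) {
        // URLEncoder writes a space as '+', which is only valid in query
        // strings; in a path it must be %20. A literal '+' has already been
        // encoded as %2B, so this replace only touches former spaces.
        return URLEncoder.encode(raw, StandardCharsets.UTF_8).replace("+", "%20");
    }

    static URI parseUri(String keywords) {
        return URI.create(BASE + "/parse/" + segment(keywords));
    }

    static URI searchUri(String keyword, int pageNo, int pageSize) {
        return URI.create(BASE + "/search/" + segment(keyword) + "/" + pageNo + "/" + pageSize);
    }

    // Build (but do not send) a GET request for one page of search results
    static HttpRequest searchRequest(String keyword, int pageNo, int pageSize) {
        return HttpRequest.newBuilder(searchUri(keyword, pageNo, pageSize)).GET().build();
    }

    public static void main(String[] args) {
        System.out.println(parseUri("java"));
        System.out.println(searchUri("java book", 1, 10));
    }
}
```

Sending the request with java.net.http.HttpClient requires the application to be running. Calling /parse/java once fills the jd_goods index, after which /search/java/1/10 returns the first ten hits.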

Create a Vue project in any directory and run npm install vue and npm install axios.

Then copy vue.min.js and axios.min.js into the project's resources directory (the page below loads them from /js/), which keeps the front end and back end separated.

Add an asynchronous request to the page so that it can render the data:

<!DOCTYPE html>
<html xmlns:th="http://www.thymeleaf.org">

<head>
    <meta charset="utf-8"/>
    <title>狂神说Java - ES JD clone project</title>
    <link rel="stylesheet" th:href="@{/css/style.css}"/>
</head>

<body class="pg">
<div class="page" id="app">
    <div id="mallPage" class=" mallist tmall- page-not-market ">

        <!-- Header search -->
        <div id="header" class=" header-list-app">
            <div class="headerLayout">
                <div class="headerCon ">
                    <!-- Logo-->
                    <h1 id="mallLogo">
                        <img th:src="@{/images/jdlogo.png}" alt="">
                    </h1>

                    <div class="header-extra">

                        <!-- Search -->
                        <div id="mallSearch" class="mall-search">
                            <form name="searchTop" class="mallSearch-form clearfix">
                                <fieldset>
                                    <legend>Tmall Search</legend>
                                    <div class="mallSearch-input clearfix">
                                        <div class="s-combobox" id="s-combobox-685">
                                            <div class="s-combobox-input-wrap">
                                                <input v-model="keyword" type="text" autocomplete="off" value="dd" id="mq"
                                                       class="s-combobox-input" aria-haspopup="true">
                                            </div>
                                        </div>
                                        <button @click.prevent="searchPage" type="submit" id="searchbtn">Search</button>
                                    </div>
                                </fieldset>
                            </form>
                            <ul class="relKeyTop">
                                <li><a>狂神说Java</a></li>
                                <li><a>狂神说 Front End</a></li>
                                <li><a>狂神说Linux</a></li>
                                <li><a>狂神说 Big Data</a></li>
                                <li><a>狂神 on Personal Finance</a></li>
                            </ul>
                        </div>
                    </div>
                </div>
            </div>
        </div>

        <!-- Product detail page -->
        <div id="content">
            <div class="main">
                <!-- Brand categories -->
                <form class="navAttrsForm">
                    <div class="attrs j_NavAttrs" style="display:block">
                        <div class="brandAttr j_nav_brand">
                            <div class="j_Brand attr">
                                <div class="attrKey">
                                    Brand
                                </div>
                                <div class="attrValues">
                                    <ul class="av-collapse row-2">
                                        <li><a href="#"> 狂神说 </a></li>
                                        <li><a href="#"> Java </a></li>
                                    </ul>
                                </div>
                            </div>
                        </div>
                    </div>
                </form>

                <!-- Sort rules -->
                <div class="filter clearfix">
                    <a class="fSort fSort-cur">Overall<i class="f-ico-arrow-d"></i></a>
                    <a class="fSort">Popularity<i class="f-ico-arrow-d"></i></a>
                    <a class="fSort">New<i class="f-ico-arrow-d"></i></a>
                    <a class="fSort">Sales<i class="f-ico-arrow-d"></i></a>
                    <a class="fSort">Price<i class="f-ico-triangle-mt"></i><i class="f-ico-triangle-mb"></i></a>
                </div>

                <!-- Product details -->
                <div class="view grid-nosku">

                    <div class="product" v-for="item in results">
                        <div class="product-iWrap">
                            <!-- Product cover image -->
                            <div class="productImg-wrap">
                                <a class="productImg">
                                    <img :src="item.img"> 
                                </a>
                            </div>
                            <!-- Price -->
                            <p class="productPrice">
                                <em>{{item.price}}</em>
                            </p>
                            <!-- Title -->
                            <p class="productTitle">
                                <a v-html="item.title"></a>
                            </p>
                            <!-- Shop name -->
                            <div class="productShop">
                                <span>Shop: 狂神说Java </span>
                            </div>
                            <!-- Transaction info -->
                            <p class="productStatus">
                                <span>Monthly sales <em>999</em></span>
                                <span>Reviews <a>3</a></span>
                            </p>
                        </div>
                    </div>
                </div>
            </div>
        </div>
    </div>
</div>
<script th:src="@{/js/vue.min.js}"></script>
<script th:src="@{/js/axios.min.js}"></script>
<script>
    new Vue({
        el: '#app',
        data: {
            keyword: '',  // search keyword
            results: [],  // search results
        },
        methods: {
            searchPage() {
                var keyword = this.keyword;
                console.log(keyword)
                // Encode the keyword so spaces or symbols do not break the path
                axios.get("/search/" + encodeURIComponent(keyword) + "/1/10").then(res => {
                    console.log(res.data)
                    this.results = res.data;
                })
            }
        }
    })
</script>

</body>
</html>

Summary: we wrote a configuration class to connect to es, crawled three fields (title, img, price) from the website and stored them in es, then searched that data in es and rendered the results with Vue. es can serve as an index store (or database) for content, and CRUD operations against it are enough to build the search feature. I don't have a concrete feel for the performance yet, but in theory it should be much faster for search than a database such as MySQL.

Origin blog.csdn.net/qq_51326491/article/details/128927162