Using PubSubHubbub to Build a Timed RSS Updater - Laravel RSS (3)

The next step is to keep optimizing: like https://feed43.com/, let users enter a web URL to generate an RSS feed, and set the update interval according to actual needs.

Excerpted from: "Take 2 Hours to Create an RSS Generator" mp.weixin.qq.com/s/mRjoKgkq1…

Today I will implement the second part: setting the update interval according to actual needs.

Feed updates explained

We use XPath to crawl website content. Those websites update their content, but they do not tell us in real time that they have done so. So how does an RSS reader perform its so-called "updates"?

To get the article updates of a feed, the most basic and simple way is to regularly visit its RSS address and check whether the corresponding XML file has changed.
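
As a minimal sketch of this polling approach, the snippet below compares a hash of the feed body with the hash kept from the previous visit. The function name `feedHasChanged` and the sample feed strings are illustrative assumptions, not part of the project code; a real reader would probably also look at HTTP headers such as ETag or Last-Modified.

```php
<?php

/**
 * Report whether the feed body differs from the one seen on the last visit.
 * (Hypothetical helper for illustration only.)
 *
 * @param string      $currentXml feed body fetched just now
 * @param string|null $lastHash   hash stored after the previous visit, null on the first visit
 * @return array [bool changed, string hash to store for the next visit]
 */
function feedHasChanged(string $currentXml, ?string $lastHash): array
{
    $hash = hash('sha256', $currentXml);

    return [$hash !== $lastHash, $hash];
}

// Usage: simulate three visits to the same RSS address.
[$changed1, $hash] = feedHasChanged('<rss><item>A</item></rss>', null);
[$changed2, $hash] = feedHasChanged('<rss><item>A</item></rss>', $hash);
[$changed3, $hash] = feedHasChanged('<rss><item>A</item><item>B</item></rss>', $hash);
```

The first and third visits report a change; the second does not, because the body is identical to the previous one.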

This raises a second problem. Even if we can detect that a feed has been updated, how do we notify our "subscribers" (e.g. IFTTT)?

Although Google Reader has been shut down, it left a legacy: the PubSubHubbub protocol. Under PubSubHubbub, whenever a content publisher (Publisher) publishes new content, it actively notifies a third-party server called the Hub, and the Hub then sends an HTTP POST request to push the update and the article content to the subscribers (Subscriber) of that content source, thus truly achieving "instant" updates.

In practice, RSS services usually play the subscriber role in this three-party relationship. They no longer need to refresh the feed repeatedly to get content promptly; they just wait for the Hub to relay the content source's notification. This greatly reduces cost and largely removes the problem of late updates. As for the Hub server at the center of the protocol, Google built one back then and it is still running; in addition, a company called Superfeedr also offers a public Hub service.
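
To make the subscriber's role more concrete: before a Hub starts pushing, it verifies the subscription by sending the subscriber a GET request carrying `hub.mode`, `hub.topic` and `hub.challenge`, and the subscriber confirms by echoing the challenge back. The sketch below shows that check as a plain function. The name `verifySubscription` and the example.com topic URLs are assumptions made for illustration; they do not come from this project.

```php
<?php

/**
 * Decide how a subscriber answers a Hub's verification request.
 * (Hypothetical helper. $params uses the protocol's dotted names; note that
 * PHP's $_GET rewrites the dots to underscores, e.g. hub_mode.)
 *
 * @return array [int HTTP status code, string response body]
 */
function verifySubscription(array $params, array $wantedTopics): array
{
    $mode      = $params['hub.mode'] ?? '';
    $topic     = $params['hub.topic'] ?? '';
    $challenge = $params['hub.challenge'] ?? '';

    // Confirm only subscriptions we actually asked for, by echoing the challenge.
    if ($mode === 'subscribe' && $challenge !== '' && in_array($topic, $wantedTopics, true)) {
        return [200, $challenge];
    }

    // Anything else is refused.
    return [404, ''];
}

// Usage with placeholder topic URLs.
$topics = ['https://example.com/feed/1'];
$ok = verifySubscription(
    ['hub.mode' => 'subscribe', 'hub.topic' => 'https://example.com/feed/1', 'hub.challenge' => 'abc123'],
    $topics
);
$refused = verifySubscription(
    ['hub.mode' => 'subscribe', 'hub.topic' => 'https://example.com/other', 'hub.challenge' => 'abc123'],
    $topics
);
```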

Note that "real-time" is only relative: a notification cannot be sent before the RSS service has captured the feed update, and as we already know, that capture often comes with an unavoidable delay. So the real-time push function mainly helps us not miss the content we care about; for truly minute-level timeliness, following social networks directly is probably more reliable in today's media ecosystem.

The text above is mostly excerpted from: "Which mainstream RSS service to choose in 2018? Feedly, Inoreader, and NewsBlur Comprehensive Review" sspai.com/post/44420

Implement update function

With the PubSubHubbub protocol supporting our "auto-update" capability, we can now improve our code.

Add interval field

php artisan make:migration add_interval_to_xpaths_table --table=xpaths

The default interval is 2 hours:

<?php

use Illuminate\Support\Facades\Schema;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Database\Migrations\Migration;

class AddIntervalToXpathsTable extends Migration
{
    /**
     * Run the migrations.
     *
     * @return void
     */
    public function up()
    {
        Schema::table('xpaths', function (Blueprint $table) {
            $table->integer("interval")->default(2);
        });
    }

    /**
     * Reverse the migrations.
     *
     * @return void
     */
    public function down()
    {
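        Schema::table('xpaths', function (Blueprint $table) {
            // Remove the column added in up() so the migration can be rolled back.
            $table->dropColumn('interval');
        });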
    }
}

Run the migration:

php artisan migrate

Add time interval selection box:

$form->select('interval', 'Update interval')->options(
    [
        1 => '1 hour',
        2 => '2 hours',
        4 => '4 hours',
        8 => '8 hours',
        12 => 'Half a day',
    ]
);

Add Link Header

Since Superfeedr charges a fee, I tried this hub instead: phubb.cweiske.de/

As the protocol requires, two link elements need to be added to the feed header:

<link href="{{ url("/feed/$xpath->id") }}" rel="self" type="application/atom+xml"/>
<link rel="hub" href="http://phubb.cweiske.de/hub.php" />
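
These two links are what a subscriber uses for discovery: the `hub` link says where to subscribe, and the `self` link gives the feed's own address to subscribe to. The following is a rough sketch of that discovery step using SimpleXML, assuming the feed is served as Atom (as the `type` attribute above suggests). The function name `discoverLinks` and the example.com feed URL are made up for illustration.

```php
<?php

/**
 * Extract the hub and self link targets from an Atom feed.
 * (Hypothetical helper for illustration only.)
 *
 * @return array ['hub' => string|null, 'self' => string|null]
 */
function discoverLinks(string $feedXml): array
{
    $feed  = new SimpleXMLElement($feedXml);
    $links = ['hub' => null, 'self' => null];

    // Atom elements live in the Atom namespace; rel/href are plain attributes.
    foreach ($feed->children('http://www.w3.org/2005/Atom')->link as $link) {
        $attrs = $link->attributes();
        $rel   = (string) $attrs['rel'];
        if (array_key_exists($rel, $links)) {
            $links[$rel] = (string) $attrs['href'];
        }
    }

    return $links;
}

// Usage with a tiny sample feed (placeholder self URL).
$sample = <<<'XML'
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example</title>
  <link href="https://example.com/feed/1" rel="self" type="application/atom+xml"/>
  <link rel="hub" href="http://phubb.cweiske.de/hub.php"/>
</feed>
XML;

$links = discoverLinks($sample);
```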

Add timer

Next, we use Laravel's task scheduling to refresh each RSS feed according to its own interval.

Specific reference: laravel-china.org/docs/larave…

When using the scheduler, you only need to add the following Cron entry to the server.

* * * * * php /path-to-your-project/artisan schedule:run >> /dev/null 2>&1

This Cron entry calls the Laravel command scheduler every minute. Each time schedule:run executes, Laravel evaluates your scheduled tasks and runs the ones that are due.

Each update interval may cover many RSS feeds, and every one of them needs a network request, so we schedule a queued job here and run it every hour:

$schedule->job(new AutoUpdateRss(new EloquentRssRepository()))->hourly();

Create a task

Now for the real core!

Here we use the database queue driver. First, we need a table to store the jobs. The queue:table Artisan command generates a migration for this table; once the migration exists, the migrate command creates the table:

php artisan queue:table
php artisan migrate

Change the default queue driver to database:

'default' => env('QUEUE_DRIVER', 'database'),

Generate task class

php artisan make:job AutoUpdateRss

The AutoUpdateRss class queries for the xpaths that are due, pings the Hub server for each of them, and the Hub in turn tells the "subscribers" that it is time to update.

<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Queue\SerializesModels;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use App\Repositories\RssRepositoryContract;

class AutoUpdateRss implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    private $rssRC;

    /**
     * Create a new job instance.
     *
     * @return void
     */
    public function __construct(RssRepositoryContract $rssRC)
    {
        $this->rssRC = $rssRC;
    }

    /**
     * Execute the job.
     *
     * @return void
     */
    public function handle()
    {
        // Select the feeds whose interval has elapsed; stop if none are due.
        $xpaths = $this->rssRC->query();
        if ($xpaths->isEmpty()) {
            return;
        }
        // Ping the Hub for every due feed.
        $this->rssRC->update($xpaths);
    }
}

For an introduction to Laravel's queue system, see this: laravel-china.org/docs/larave…

query method

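// Requires: use Carbon\Carbon; plus an import for the Xpath model.
public function query() {
    $allXpaths = Xpath::all();

    $now = Carbon::now();

    // Keep only the feeds that are due: the whole hours elapsed since the
    // feed was created must be a multiple of its interval (in hours).
    $xpaths = $allXpaths->filter(function ($value, $key) use ($now) {
        $diff = $now->diffInHours(Carbon::parse($value->created_at));
        return $diff % $value->interval == 0;
    });

    return $xpaths;
}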
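
To see at which hours this filter picks up a feed, here is the same modulo rule as a standalone function. `isDue` is a hypothetical name used only for this sketch, and the guard for a non-positive interval is an added assumption (the select box only offers positive values).

```php
<?php

/**
 * Same rule as the filter above: a feed is due when the whole hours elapsed
 * since it was created are a multiple of its interval.
 * (Hypothetical helper for illustration only.)
 */
function isDue(int $hoursSinceCreated, int $intervalHours): bool
{
    if ($intervalHours <= 0) {
        return false; // guard against modulo by zero
    }

    return $hoursSinceCreated % $intervalHours === 0;
}

// Hours 0..12 at which a feed with a 4-hour interval would be refreshed.
$dueHours = array_values(array_filter(range(0, 12), function ($hour) {
    return isDue($hour, 4);
}));
// $dueHours is [0, 4, 8, 12]
```

Because the job runs once an hour, each feed is selected once per interval. If a scheduler run is skipped at one of those hours, the feed simply waits for the next multiple.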

Hub notification method

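// Requires: use GuzzleHttp\Client; use GuzzleHttp\Pool;
public function update($xpaths) {
    $client = new Client();

    // Yield one callable per feed. form_params is a request option, so it has
    // to go through the client (postAsync); the third argument of the PSR-7
    // Request constructor is the header array, not request options.
    $requests = function ($xpaths) use ($client) {
        $uri = 'http://phubb.cweiske.de/hub.php';
        foreach ($xpaths as $xpath) {
            $feedUrl = url("/feed/$xpath->id");
            yield function () use ($client, $uri, $feedUrl) {
                return $client->postAsync($uri, [
                    'form_params' => [
                        'hub.mode' => 'publish',
                        'hub.url' => $feedUrl,
                    ],
                ]);
            };
        }
    };

    $pool = new Pool($client, $requests($xpaths), [
        'concurrency' => 5,
        'fulfilled' => function ($response, $index) {
            // called for each successful response
        },
        'rejected' => function ($reason, $index) {
            // called for each failed request
        },
    ]);

    // Initiate the transfers and create a promise
    $promise = $pool->promise();

    // Force the pool of requests to complete.
    $promise->wait();
}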
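
For reference, the publish ping itself is just a form-encoded POST body with two fields. The sketch below builds that body with plain PHP so it can be inspected without sending anything. `buildPublishBody` is a hypothetical helper name and the example.com feed URL is a placeholder.

```php
<?php

/**
 * Build the form-encoded body of a PubSubHubbub publish ping.
 * (Hypothetical helper for illustration only.)
 */
function buildPublishBody(string $feedUrl): string
{
    // Pass the separator explicitly so the result does not depend on php.ini.
    return http_build_query([
        'hub.mode' => 'publish',
        'hub.url'  => $feedUrl,
    ], '', '&');
}

$body = buildPublishBody('https://example.com/feed/1');
// $body is "hub.mode=publish&hub.url=https%3A%2F%2Fexample.com%2Ffeed%2F1"
```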

Here we use Guzzle (guzzle-cn.readthedocs.io/zh_CN/lates…), sending concurrent asynchronous requests to improve efficiency.

Install Guzzle with Composer:

composer require guzzlehttp/guzzle

For more on using Guzzle, see: "Recommend a PHP Network Request Plugin Guzzle" mp.weixin.qq.com/s/w2I8hUmHu…

Test

Everything is ready; all that remains is to test it. For the test, we change the schedule to update the RSS every five minutes, and again use the "IFTTT to DingTalk" setup to watch the result.

Summary

The code is rough, but it has all the essential parts. The process mainly relies on these core technologies and tools:

  1. Laravel's task scheduling
  2. Laravel's queue
  3. Guzzle HTTP client
  4. phubb - PHP PubSubHubbub server

Our RSS feeds now have a timed update function. The next step is to start writing the front end and turn this into a website tool that more people can use.

The specific code has been synchronized to github: github.com/fanly/lrss

To be continued
