Filtering rules for selected Twitter API parameters

Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/qq_27378621/article/details/85263566

1. Real-time Tweets

https://developer.twitter.com/en/docs/tweets/filter-realtime/guides/basic-stream-parameters

track

A comma-separated list of phrases which will be used to determine what Tweets will be delivered on the stream. A phrase may be one or more terms separated by spaces, and a phrase will match if all of the terms in the phrase are present in the Tweet, regardless of order and ignoring case. By this model, you can think of commas as logical ORs, while spaces are equivalent to logical ANDs (e.g. ‘the twitter’ is the AND twitter, and ‘the,twitter’ is the OR twitter). In short, commas act as OR and spaces act as AND.
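The OR/AND model can be sketched in a few lines of Python. This is only an illustration of the rule described above, not Twitter's actual matcher: the function name track_matches is made up for this example, and splitting the Tweet on whitespace is a simplification that ignores the punctuation, entity, and URL rules described further down.

```python
def track_matches(track: str, tweet_text: str) -> bool:
    """Return True if tweet_text satisfies the track value.

    Simplified model: commas separate phrases (OR), spaces separate
    terms inside a phrase (AND), and the comparison ignores case.
    """
    tweet_terms = set(tweet_text.lower().split())
    for phrase in track.split(","):
        terms = phrase.lower().split()
        # A phrase matches when every one of its terms is present, in any order.
        if terms and all(term in tweet_terms for term in terms):
            return True
    return False


print(track_matches("the twitter", "Twitter is the best"))  # True: both terms present
print(track_matches("the,twitter", "I like Twitter"))       # True: one phrase is enough
print(track_matches("the twitter", "I like Twitter"))       # False: "the" is missing
```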

The text of the Tweet and some entity fields are considered for matches. Specifically, the text attribute of the Tweet, expanded_url and display_url for links and media, text for hashtags, and screen_name for user mentions are checked for matches.

Each phrase must be between 1 and 60 bytes, inclusive.
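Because the limit is counted in bytes rather than characters, it can be worth checking a track value before sending it. The sketch below assumes the bytes are counted on the UTF-8 encoding of each phrase; the helper name invalid_track_phrases is hypothetical.

```python
from typing import List


def invalid_track_phrases(track: str) -> List[str]:
    """Return the phrases whose UTF-8 encoding is outside 1..60 bytes."""
    bad = []
    for phrase in track.split(","):
        size = len(phrase.encode("utf-8"))  # assumption: limit applies to UTF-8 bytes
        if not 1 <= size <= 60:
            bad.append(phrase)
    return bad


# 31 accented characters are only 31 characters but 62 bytes in UTF-8.
print(invalid_track_phrases("twitter," + "é" * 31))
```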

Exact matching of phrases (equivalent to quoted phrases in most search engines) is not supported.

Punctuation and special characters will be considered part of the term they are adjacent to. In this sense, “hello.” is a different track term than “hello”. However, matches will ignore punctuation present in the Tweet. So “hello” will match both “hello world” and “my brother says hello.” Note that punctuation is not considered to be part of a #hashtag or @mention, so a track term containing punctuation will not match either #hashtags or @mentions.

UTF-8 characters will match exactly, even in cases where an “equivalent” ASCII character exists. For example, “touché” will not match a Tweet containing “touche”.

Non-space-separated languages, such as CJK, are currently unsupported.

URLs are considered words for the purposes of matches, which means that the entire domain and path must be included in the track query for a Tweet containing a URL to match. Note that display_url does not contain a protocol, so this is not required to perform a match.

Twitter currently canonicalizes the domain “www.example.com” to “example.com” before the match is performed, so omit the “www” from URL track terms.
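A small helper can apply both URL rules before a term is added to the track list. This is a sketch under the rules stated above; the function name normalize_url_term is hypothetical, and dropping the protocol is my own choice based on the note that display_url does not contain one.

```python
def normalize_url_term(term: str) -> str:
    """Drop the protocol and a leading "www." from a URL track term."""
    for prefix in ("https://", "http://"):
        if term.lower().startswith(prefix):
            term = term[len(prefix):]
            break
    # The domain is canonicalized without "www", so omit it from the term.
    if term.lower().startswith("www."):
        term = term[len("www."):]
    return term


print(normalize_url_term("http://www.example.com/foobarbaz"))  # example.com/foobarbaz
```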

Finally, to address a common use case where you may want to track all mentions of a particular domain name (i.e., regardless of subdomain or path), you should use “example com” as the track parameter for “example.com” (notice the lack of period between “example” and “com” in the track parameter). This will be over-inclusive, so make sure to do additional pattern-matching in your code. See the table below for more examples related to this issue.

Track examples:

| Parameter value | Will match... | Will not match... |
| --- | --- | --- |
| Twitter | TWITTER; twitter; “Twitter”; twitter.; #twitter; @twitter; http://twitter.com | TwitterTracker; #newtwitter |
| Twitter’s | I like Twitter’s new design | Someday I’d like to visit @Twitter’s office |
| twitter api,twitter streaming | The Twitter API is awesome; The twitter streaming service is fast; Twitter has a streaming API | I’m new to Twitter |
| example.com | Someday I will visit example.com | There is no example.com/foobarbaz |
| example.com/foobarbaz | example.com/foobarbaz; www.example.com/foobarbaz | example.com |
| www.example.com/foobarbaz | | www.example.com/foobarbaz |
| example com | example.com; www.example.com; foo.example.com; foo.example.com/bar; I hope my startup isn’t merely another example of a dot com boom! | |
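Since tracking “example com” is over-inclusive, the additional pattern-matching recommended above could look like the following. It is a minimal sketch, not an official recipe: the function name mentions_domain and the exact regular expression are my own assumptions about what counts as a mention of the domain.

```python
import re


def mentions_domain(text: str, domain: str) -> bool:
    """Return True if text contains domain itself or any subdomain of it."""
    pattern = re.compile(
        # Optional subdomain labels, then the domain, not followed by
        # more label characters or by a further ".label".
        r"(?<![\w-])(?:[\w-]+\.)*" + re.escape(domain) + r"(?![\w-]|\.\w)",
        re.IGNORECASE,
    )
    return pattern.search(text) is not None


tweets = [
    "Check out foo.example.com/bar",
    "I hope my startup isn't merely another example of a dot com boom!",
]
# Keep only the Tweets that really mention the domain.
print([t for t in tweets if mentions_domain(t, "example.com")])
```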

 

2. A user's historical Tweets (user timeline)

https://developer.twitter.com/en/docs/tweets/timelines/api-reference/get-statuses-user_timeline.html

Parameters

| Name | Required | Description | Default Value | Example |
| --- | --- | --- | --- | --- |
| user_id | optional | The ID of the user for whom to return results. | | 12345 |
| screen_name | optional | The screen name of the user for whom to return results. | | noradio |
| since_id | optional | Returns results with an ID greater than (that is, more recent than) the specified ID. There are limits to the number of Tweets that can be accessed through the API. If the limit of Tweets has occurred since the since_id, the since_id will be forced to the oldest ID available. | | 12345 |
| count | optional | Specifies the number of Tweets to try to retrieve, up to a maximum of 200 per distinct request. The value of count is best thought of as a limit to the number of Tweets to return because suspended or deleted content is removed after the count has been applied. We include retweets in the count, even if include_rts is not supplied. It is recommended you always send include_rts=1 when using this API method. | | |
| max_id | optional | Returns results with an ID less than (that is, older than) or equal to the specified ID. | | 54321 |
| trim_user | optional | When set to either true, t, or 1, each Tweet returned in a timeline will include a user object including only the status author's numerical ID. Omit this parameter to receive the complete user object. | | true |
| exclude_replies | optional | This parameter will prevent replies from appearing in the returned timeline. Using exclude_replies with the count parameter will mean you will receive up to count Tweets, because the count parameter retrieves that many Tweets before filtering out retweets and replies. | | true |
| include_rts | optional | When set to false, the timeline will strip any native retweets (though they will still count toward both the maximal length of the timeline and the slice selected by the count parameter). Note: If you're using the trim_user parameter in conjunction with include_rts, the retweets will still contain a full user object. | | false |
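Since count is capped at 200 per request, walking back through a timeline means paging with max_id. The sketch below shows one way to do that. The endpoint URL is the v1.1 one from the reference linked above; fetch_page is a hypothetical function you supply yourself to perform the authenticated GET and return the decoded JSON list, so authentication and rate limiting are deliberately left out.

```python
from typing import Callable, Iterator, List, Optional
from urllib.parse import urlencode

# Endpoint for GET statuses/user_timeline (API v1.1).
TIMELINE_URL = "https://api.twitter.com/1.1/statuses/user_timeline.json"


def timeline_url(screen_name: str, max_id: Optional[int] = None,
                 since_id: Optional[int] = None, count: int = 200) -> str:
    """Build the request URL for one page of a user's timeline."""
    # include_rts=1 is always sent, as the count description recommends.
    params = {"screen_name": screen_name, "count": count, "include_rts": 1}
    if since_id is not None:
        params["since_id"] = since_id
    if max_id is not None:
        params["max_id"] = max_id
    return TIMELINE_URL + "?" + urlencode(params)


def iter_timeline(fetch_page: Callable[[str], List[dict]],
                  screen_name: str) -> Iterator[dict]:
    """Yield Tweets from newest to oldest, one page at a time.

    fetch_page is a hypothetical caller-supplied function: it receives a
    request URL and returns the decoded JSON list of Tweets.
    """
    max_id = None
    while True:
        page = fetch_page(timeline_url(screen_name, max_id=max_id))
        if not page:
            return
        yield from page
        # max_id is inclusive, so step just below the oldest ID seen so far.
        max_id = min(tweet["id"] for tweet in page) - 1
```

Subtracting 1 from the oldest ID avoids receiving the same Tweet twice, because max_id returns Tweets with an ID less than or equal to the given value.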

3. Historical Tweets by keyword (Tweet search)

https://developer.twitter.com/en/docs/tweets/search/guides/standard-operators

This method does not support using a comma “,” to express an OR relationship between keywords.

Limit your searches to 10 keywords and operators.
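If you already have a comma-separated keyword list from the streaming track parameter, it has to be rewritten with the OR operator for search. Here is a minimal sketch for single-word keywords. The helper name keywords_to_or_query is hypothetical, and counting every whitespace-separated token (including each OR) toward the limit of 10 is a conservative assumption of mine, not something the documentation spells out.

```python
from typing import List

MAX_TERMS = 10  # limit quoted above: 10 keywords and operators


def keywords_to_or_query(keywords: List[str]) -> str:
    """Join single-word keywords with the OR operator.

    Assumption: every whitespace-separated token of the result,
    including each OR, counts toward the 10-item limit.
    """
    cleaned = [k.strip() for k in keywords if k.strip()]
    query = " OR ".join(cleaned)
    if len(query.split()) > MAX_TERMS:
        raise ValueError("query exceeds 10 keywords and operators")
    return query


print(keywords_to_or_query("love,hate".split(",")))  # love OR hate
```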

Standard search operators

The query can have operators that modify its behavior.  Below are examples that illustrate the available operators in standard search:

| Operator | Finds Tweets... |
| --- | --- |
| watching now | containing both “watching” and “now”. This is the default operator. |
| "happy hour" | containing the exact phrase “happy hour”. |
| love OR hate | containing either “love” or “hate” (or both). |
| beer -root | containing “beer” but not “root”. |
| #haiku | containing the hashtag “haiku”. |
| from:interior | sent from Twitter account “interior”. |
| list:NASA/astronauts-in-space-now | sent from a Twitter account in the NASA list astronauts-in-space-now. |
| to:NASA | a Tweet authored in reply to Twitter account “NASA”. |
| @NASA | mentioning Twitter account “NASA”. |
| politics filter:safe | containing “politics” with Tweets marked as potentially sensitive removed. |
| puppy filter:media | containing “puppy” and an image or video. |
| puppy -filter:retweets | containing “puppy”, filtering out retweets. |
| puppy filter:native_video | containing “puppy” and an uploaded video, Amplify video, Periscope, or Vine. |
| puppy filter:periscope | containing “puppy” and a Periscope video URL. |
| puppy filter:vine | containing “puppy” and a Vine. |
| puppy filter:images | containing “puppy” and links identified as photos, including third parties such as Instagram. |
| puppy filter:twimg | containing “puppy” and a pic.twitter.com link representing one or more photos. |
| hilarious filter:links | containing “hilarious” and linking to a URL. |
| puppy url:amazon | containing “puppy” and a URL with the word “amazon” anywhere within it. |
| superhero since:2015-12-21 | containing “superhero” and sent since date “2015-12-21” (year-month-day). |
| puppy until:2015-12-21 | containing “puppy” and sent before the date “2015-12-21”. |
| movie -scary :) | containing “movie”, but not “scary”, and with a positive attitude. |
| flight :( | containing “flight” and with a negative attitude. |
| traffic ? | containing “traffic” and asking a question. |
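Operators are combined by joining them with spaces, and the result has to be percent-encoded before it goes into a request. The sketch below shows this with the standard library only. The helper names are hypothetical, and using the string as the q parameter of the standard search endpoint is an assumption taken from the search API reference rather than from the operator list itself.

```python
from urllib.parse import quote


def build_search_query(*parts: str) -> str:
    """Join operator fragments with spaces (the default AND behaviour)."""
    return " ".join(p.strip() for p in parts if p.strip())


def encode_query(query: str) -> str:
    """Percent-encode a query so it can be used as a URL parameter value."""
    return quote(query, safe="")


q = build_search_query("puppy", "-filter:retweets", "since:2015-12-21")
print(q)                # puppy -filter:retweets since:2015-12-21
print(encode_query(q))  # puppy%20-filter%3Aretweets%20since%3A2015-12-21
```

Encoding matters most for operators such as #haiku and from:interior, because # and : have special meaning inside a URL.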
