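[Python Treasure Chest] Diving into the Sea of Social Media Data: Python Tools That Open the Door to Analysis

The Secrets of Social Media Data: A Complete Guide to Python Tools and Techniques

Preface

In the digital age, social media connects the world, and understanding this vast, complex network is key to interpreting current trends and user behavior. This article introduces a set of Python tools and techniques for collecting social media data and drawing insights from it.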


1. Tweepy

1.1 API Authentication and Basic Usage

Tweepy is a Python library for accessing the Twitter API. First obtain an API key and access token from a Twitter developer account and authenticate; the basic calls below then become available.

import tweepy

# API authentication
consumer_key = 'Your_Consumer_Key'
consumer_secret = 'Your_Consumer_Secret'
access_token = 'Your_Access_Token'
access_token_secret = 'Your_Access_Token_Secret'

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

api = tweepy.API(auth)

# Fetch user information
user = api.get_user(screen_name='twitter_handle')
print(f"User: {user.screen_name}, Followers: {user.followers_count}")

# Post a tweet
api.update_status("Hello, Twitter API!")

This code shows Tweepy's API authentication and basic operations such as fetching user information and posting a tweet.
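Hard-coding the four credentials is fine for a demo, but they are usually kept out of the source file. The following is a minimal sketch of one way to do that, reading them from environment variables. The variable names and the load_credentials helper are assumptions made for this illustration and are not part of Tweepy.

```python
import os

# Hypothetical variable names; use whatever names your deployment defines.
REQUIRED_KEYS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_credentials(env=None):
    """Return the four credentials as a dict, or raise if any is missing."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing credentials: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}
```

The returned values can then be passed to the authentication calls in place of the string literals.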

1.2 Data Collection and Analysis

Tweepy offers several ways to collect Twitter data for analysis, such as fetching a user's timeline or searching tweets by keyword. A simple example:

# Fetch a user's timeline
tweets = api.user_timeline(screen_name='twitter_handle', count=10)

for tweet in tweets:
    print(f"{tweet.user.screen_name}: {tweet.text}")

# Search by keyword (Tweepy 3.x name; Tweepy 4.x renamed it to api.search_tweets)
search_results = api.search(q='python', count=5)

for result in search_results:
    print(f"{result.user.screen_name}: {result.text}")

This code shows how to fetch tweets from a user's timeline and search for a keyword with Tweepy.

1.3 Real-Time Data Streams

Tweepy also supports real-time data streams: a StreamListener subclass handles tweets as they are published. The example uses the Tweepy 3.x interface; Tweepy 4.x merged StreamListener into Stream.

from tweepy.streaming import StreamListener
from tweepy import Stream

class MyStreamListener(StreamListener):
    def on_status(self, status):
        # Called once for every tweet delivered by the stream
        print(f"{status.user.screen_name}: {status.text}")

# Create the Stream object and start the real-time stream
my_stream_listener = MyStreamListener()
my_stream = Stream(auth=api.auth, listener=my_stream_listener)

# Keep only tweets containing the keyword 'python'
my_stream.filter(track=['python'])

With this code, tweets containing the keyword "python" are received in real time, which demonstrates Tweepy's streaming support.

1.4 Analyzing User Interactions and Trends

Beyond fetching user information and tweets, Tweepy can be used to analyze a user's interactions and follower trends. Sample code:

# Fetch the user's followers (Tweepy 3.x name; Tweepy 4.x renamed it to api.get_followers)
followers = api.followers(screen_name='twitter_handle', count=5)

print("Followers of twitter_handle:")
for follower in followers:
    print(follower.screen_name)

# Analyze the user's interactions over the latest 100 tweets
interactions = api.user_timeline(screen_name='twitter_handle', count=100)

likes_count = 0
retweets_count = 0

for tweet in interactions:
    likes_count += tweet.favorite_count
    retweets_count += tweet.retweet_count

print(f"Total Likes: {likes_count}, Total Retweets: {retweets_count}")

This code shows how to fetch a user's follower list with Tweepy and total interaction data such as likes and retweets.
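Totals alone depend on how many tweets were fetched, so per-tweet averages are often easier to compare between accounts. The sketch below illustrates that step offline. It assumes only that each tweet object exposes favorite_count and retweet_count, as in the loop above; summarize_engagement and the sample objects are hypothetical stand-ins, not Tweepy calls.

```python
from types import SimpleNamespace


def summarize_engagement(tweets):
    """Total and per-tweet average likes/retweets for tweet-like objects."""
    tweets = list(tweets)
    likes = sum(t.favorite_count for t in tweets)
    retweets = sum(t.retweet_count for t in tweets)
    n = len(tweets)
    return {
        "tweets": n,
        "likes": likes,
        "retweets": retweets,
        "avg_likes": likes / n if n else 0.0,
        "avg_retweets": retweets / n if n else 0.0,
    }


# Stand-in objects with the same two attributes a timeline entry carries
sample = [
    SimpleNamespace(favorite_count=10, retweet_count=2),
    SimpleNamespace(favorite_count=4, retweet_count=0),
]
summary = summarize_engagement(sample)
```

Passing the list returned by the timeline call instead of sample would produce the same summary for real data.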

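1.5 Handling Large Result Sets with Cursor

For large result sets, Tweepy provides Cursor, which walks through paginated results for you. The following example iterates over every tweet the timeline endpoint returns for a user.

# Use Cursor to iterate over the user's tweets page by page
all_tweets = tweepy.Cursor(api.user_timeline, screen_name='twitter_handle').items()

for tweet in all_tweets:
    print(f"{tweet.user.screen_name}: {tweet.text}")

This code shows how Tweepy's Cursor retrieves a user's tweets without manual pagination, which makes large result sets easier to process.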
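To make clear what a cursor saves you from writing, here is a conceptual sketch of the pagination loop it hides. This is not Tweepy's implementation: fetch_page is a hypothetical stand-in for a paginated API call, and iterate_items is an illustrative helper.

```python
def fetch_page(cursor, page_size=3, total=8):
    """Hypothetical stand-in for a paginated API call.

    Returns (items, next_cursor); next_cursor is None on the last page.
    """
    start = cursor or 0
    end = min(start + page_size, total)
    items = [f"tweet-{i}" for i in range(start, end)]
    return items, (end if end < total else None)


def iterate_items(fetch, limit=None):
    """Yield items one by one, requesting the next page only when needed."""
    cursor, yielded = None, 0
    while True:
        items, cursor = fetch(cursor)
        for item in items:
            if limit is not None and yielded >= limit:
                return
            yield item
            yielded += 1
        if cursor is None:
            return
```

Because iterate_items is a generator, stopping early (for example with limit=4) means later pages are never requested, which matters when every page counts against a rate limit.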

1.6 Data Visualization and Insights

Combining Tweepy with a visualization library such as Matplotlib or Seaborn makes analysis results easier to read. A simple example, using the totals computed in section 1.4:

import matplotlib.pyplot as plt

# Interaction totals from section 1.4
labels = ['Likes', 'Retweets']
counts = [likes_count, retweets_count]

plt.bar(labels, counts, color=['blue', 'green'])
plt.title('User Interaction Analysis')
plt.xlabel('Interaction Type')
plt.ylabel('Count')
plt.show()

This code shows how to plot the user interaction totals as a simple bar chart with Matplotlib.

2. python-twitter

2.1 Calling the Interface and Configuring Permissions

python-twitter is another library for accessing the Twitter API. Before using it, you authenticate with the API and make sure the app has the permissions it needs. The basic setup and a first call look like this:

import twitter

# API authentication
api = twitter.Api(
    consumer_key='Your_Consumer_Key',
    consumer_secret='Your_Consumer_Secret',
    access_token_key='Your_Access_Token',
    access_token_secret='Your_Access_Token_Secret'
)

# Fetch user information
user = api.GetUser(screen_name='twitter_handle')
print(f"User: {user.screen_name}, Followers: {user.followers_count}")

This code shows API authentication and a basic interface call with python-twitter.

2.2 Retrieving User Information and Posts

python-twitter can retrieve user information and a user's posts. An example:

# Fetch user information
user = api.GetUser(screen_name='twitter_handle')
print(f"User: {user.screen_name}, Followers: {user.followers_count}")

# Fetch the user's posts
statuses = api.GetUserTimeline(screen_name='twitter_handle', count=5)

for status in statuses:
    print(f"{status.user.screen_name}: {status.text}")

This code shows how to retrieve user information and posts with python-twitter.

2.2.1 Data Cleaning and Processing

Retrieved posts often need cleaning before analysis. A simple cleaning step:

import re

# Clean the tweet text by removing URLs
cleaned_tweets = [re.sub(r'http\S+', '', status.text) for status in statuses]

for tweet in cleaned_tweets:
    print(tweet)

This code shows how to clean tweet text with a regular expression that removes URLs.
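URL removal is usually only the first cleaning step. As a sketch of how the same idea extends, the function below also drops @mentions, keeps the word of each hashtag, and normalizes whitespace. The clean_tweet name and the exact rules are choices made for this illustration, not something python-twitter provides.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
SPACE_RE = re.compile(r"\s+")


def clean_tweet(text):
    """Remove URLs and @mentions, keep hashtag words, normalize whitespace."""
    text = URL_RE.sub("", text)
    text = MENTION_RE.sub("", text)
    text = text.replace("#", "")  # "#Python" becomes "Python"
    return SPACE_RE.sub(" ", text).strip()


example = clean_tweet("@user Learning #Python today https://t.co/abc123")
```

Whether to keep or drop hashtags and mentions depends on the analysis; for topic counting, for instance, the hashtag word is often worth keeping.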

2.3 Posting and Interactions

python-twitter also supports posting tweets and interacting with them. An example that posts a tweet and then likes it:

# Post a tweet
new_status = api.PostUpdate("Hello, python-twitter!")

# Like the tweet
api.CreateFavorite(status=new_status)

This code shows the basic posting and liking operations in python-twitter.

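2.4 Handling Media Content

python-twitter supports media content such as images and videos. The media argument of PostUpdate accepts a local file path, a URL, or a file object and uploads it before posting. An example that posts an image:

# Post a tweet with an image; PostUpdate uploads the local file itself
api.PostUpdate("Check out this image!", media='path/to/image.jpg')

This code shows how to upload an image and share it in a tweet with python-twitter.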

2.5 Live Tweet Streaming

python-twitter also supports real-time tweet streams. Rather than a listener class, it exposes the stream through GetStreamFilter, a generator that yields each message as a dictionary. The following example listens for live tweets containing a keyword.

# Open a live stream filtered on the keyword 'python'
stream = api.GetStreamFilter(track=['python'])

# Each item is a dict; skip non-tweet messages such as delete notices
for message in stream:
    if 'text' in message:
        print(f"{message['user']['screen_name']}: {message['text']}")

With this code, tweets containing the keyword "python" are printed as they arrive.

2.6 Advanced Search and Filtering

python-twitter provides search and filtering options for a range of needs. An advanced search example:

# Advanced search
search_results = api.GetSearch(
    term='python',
    lang='en',
    result_type='recent',
    count=5
)

for result in search_results:
    print(f"{result.user.screen_name}: {result.text}")

This code shows how to run an advanced search with python-twitter, specifying the language and the result type.

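3. facebook-sdk

3.1 Authentication and Permission Management

To access the Facebook Graph API with facebook-sdk, you first authenticate and configure permissions. A simple example:

import facebook

# App credentials and a (long-lived) user access token
app_id = 'Your_App_ID'
app_secret = 'Your_App_Secret'
user_access_token = 'User_Access_Token'

# facebook-sdk takes the version as a 'major.minor' string without a leading 'v'
graph = facebook.GraphAPI(access_token=user_access_token, version='3.1')

This code shows how to authenticate with facebook-sdk. The permissions themselves are granted when the access token is issued.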

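3.2 Retrieving User Information and Posts

facebook-sdk can retrieve user information and a user's posts. An example:

# Fetch user information
user_info = graph.get_object('me')
print(f"User: {user_info['name']}, ID: {user_info['id']}")

# Fetch the user's posts, requesting the fields used below
user_posts = graph.get_connections('me', 'posts', fields='from,message')

for post in user_posts['data']:
    # Posts without text (for example, photo-only posts) have no 'message' key
    print(f"{post['from']['name']}: {post.get('message', '')}")

This code shows how to retrieve user information and posts with facebook-sdk.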

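3.3 Data Analysis and Insights

Data retrieved with facebook-sdk can be combined with other analysis tools for deeper insight. A simple example:

import pandas as pd

# Convert the post data to a DataFrame
posts_df = pd.DataFrame(user_posts['data'])

# 'from' holds a dict per post; reduce it to the author's name so it can be grouped
posts_df['from'] = posts_df['from'].apply(lambda author: author['name'])

# Count posts that have a message, per author
post_analysis = posts_df.groupby('from')['message'].count().reset_index()
print(post_analysis)

This code shows how to run a simple analysis on the post data retrieved with facebook-sdk.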
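Graph API responses are nested dictionaries, and some fields are absent on some posts, so it helps to see the grouping logic on its own. The sketch below performs the same per-author count with only the standard library. The sample_response data is hand-written for illustration (not real API output), and count_posts_by_author is a hypothetical helper.

```python
from collections import Counter

# Hand-written sample shaped like a Graph API response (not real data).
sample_response = {
    "data": [
        {"id": "1", "from": {"name": "Alice", "id": "a"}, "message": "Hello"},
        {"id": "2", "from": {"name": "Alice", "id": "a"}},  # no message
        {"id": "3", "from": {"name": "Bob", "id": "b"}, "message": "Hi"},
        {"id": "4", "message": "orphan"},  # no author field
    ]
}


def count_posts_by_author(response):
    """Count posts that have a message, grouped by author name."""
    counts = Counter()
    for post in response.get("data", []):
        if "message" not in post:
            continue
        author = post.get("from", {}).get("name", "unknown")
        counts[author] += 1
    return counts


author_counts = count_posts_by_author(sample_response)
```

Using .get with a default at each level is what keeps the loop from failing on posts that lack a message or an author.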

3.4 Posting and Interactions

facebook-sdk supports publishing posts and interactions such as likes and comments. An example that publishes a post and likes it:

# Publish a post; the response contains the new post's ID
post_message = "Hello, Facebook Graph API!"
new_post = graph.put_object(parent_object='me', connection_name='feed', message=post_message)

# Post ID
last_post_id = new_post['id']

# Like the post
graph.put_like(object_id=last_post_id)

This code shows the basic publish and like operations in facebook-sdk.

3.5 Uploading Photos, Videos, and Files

facebook-sdk also supports uploading media files such as photos. An example that uploads an image:

# Upload an image
with open('path/to/image.jpg', 'rb') as photo:
    graph.put_photo(image=photo, message='Check out this photo!')

This code shows how to upload an image with facebook-sdk.

3.6 Data Visualization and Report Generation

Combined with a visualization library, data from facebook-sdk can be turned into charts and reports. A simple example, using post_analysis from section 3.3:

import matplotlib.pyplot as plt

# Visualize the post data
post_analysis.plot(kind='bar', x='from', y='message', legend=False)
plt.title('User Post Analysis')
plt.xlabel('User')
plt.ylabel('Post Count')
plt.show()

This code shows how to visualize the post data with Matplotlib.

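3.7 Advanced Features and Extensions

Because facebook-sdk is a thin client for the Graph API, other connections such as events can be read in the same way, provided the token has the required permissions. A simple example:

# Fetch the user's events
user_events = graph.get_connections('me', 'events')

for event in user_events['data']:
    print(f"Event Name: {event['name']}, Location: {event.get('location', 'N/A')}")

This code shows how to retrieve a user's event information with facebook-sdk.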

4. Instaloader

4.1 Downloading Photos, Videos, and Posts

Instaloader is a tool for downloading Instagram data; it supports photos, videos, and posts. A simple example:

from instaloader import Instaloader, Profile

# Create the Instaloader object
loader = Instaloader()

# Fetch user information
profile = Profile.from_username(loader.context, 'instagram_handle')
print(f"User: {profile.username}, Followers: {profile.followers}")

# Download the user's images and videos
loader.download_profile(profile.username, profile_pic_only=False)

This code shows how to download an Instagram user's images and videos with Instaloader.

4.2 Extracting User Information and Interaction Data

Instaloader can also extract user information and interaction data. An example:

# Fetch user information
profile = Profile.from_username(loader.context, 'instagram_handle')
print(f"User: {profile.username}, Followers: {profile.followers}")

# Total the user's interaction data across all posts
likes_count = 0
comments_count = 0

for post in profile.get_posts():
    likes_count += post.likes
    comments_count += post.comments

print(f"Total Likes: {likes_count}, Total Comments: {comments_count}")

This code shows how to retrieve an Instagram user's information and interaction data with Instaloader.

4.3 Processing and Presenting the Data

Downloaded data can be processed and displayed with other libraries. The following simple demonstration uses matplotlib.

import matplotlib.pyplot as plt

# Distribution of post types; videos are counted by checking each post
post_types = ['Image', 'Video']
video_count = sum(1 for post in profile.get_posts() if post.is_video)
post_counts = [profile.mediacount - video_count, video_count]

plt.bar(post_types, post_counts, color=['blue', 'orange'])
plt.title('Post Type Distribution')
plt.xlabel('Post Type')
plt.ylabel('Count')
plt.show()

This code shows how to plot the distribution of Instagram post types with matplotlib.

4.4 Analyzing User Activity Patterns

Besides downloading data, Instaloader helps analyze a user's activity patterns. An example:

# Fetch the user's posts
posts = list(profile.get_posts())

# Count the posts in each month
monthly_post_count = {}
for post in posts:
    month_year = post.date.strftime("%Y-%m")
    monthly_post_count[month_year] = monthly_post_count.get(month_year, 0) + 1

# Plot the monthly counts in chronological order
months = sorted(monthly_post_count)
post_counts = [monthly_post_count[month] for month in months]

plt.plot(months, post_counts, marker='o', linestyle='-')
plt.title('Monthly Post Count')
plt.xlabel('Month')
plt.ylabel('Post Count')
plt.xticks(rotation=45)
plt.show()

This code shows how to retrieve a user's posts with Instaloader and chart the number of posts per month.
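The chart plots one count per month; a single average figure requires one more step. The sketch below separates the two calculations so the difference is explicit. It works on plain datetime values, which is the type of the date attribute read in the loop above; both function names are illustrative.

```python
from collections import Counter
from datetime import datetime


def posts_per_month(dates):
    """Map 'YYYY-MM' to the number of posts in that month, in chronological order."""
    counts = Counter(d.strftime("%Y-%m") for d in dates)
    return dict(sorted(counts.items()))


def average_per_month(dates):
    """Average number of posts over the months that contain at least one post."""
    per_month = posts_per_month(dates)
    return sum(per_month.values()) / len(per_month) if per_month else 0.0


sample_dates = [
    datetime(2024, 1, 5),
    datetime(2024, 1, 20),
    datetime(2024, 3, 2),
]
monthly = posts_per_month(sample_dates)
average = average_per_month(sample_dates)
```

Note that months without any post do not appear at all, so this average is taken over active months only; averaging over the full calendar span would need the empty months filled in first.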

4.5 Batch Downloading Data for Multiple Users

Instaloader can download data for several users in one run. The following example downloads the profile pictures of a list of users.

users_to_download = ['user1', 'user2', 'user3']

for user in users_to_download:
    try:
        profile = Profile.from_username(loader.context, user)
        loader.download_profile(profile.username, profile_pic_only=True)
        print(f"Downloaded profile pictures for {profile.username}")
    except Exception as e:
        # Report the failure and continue with the next user
        print(f"Error downloading data for {user}: {e}")

This code shows how to download the profile pictures of multiple users in a batch with Instaloader.
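In a longer batch it is useful to know afterwards which users succeeded and which failed, and to pause between requests. The following sketch factors that pattern into a helper that is independent of Instaloader. The run_batch name, its delay_seconds parameter, and the fake_download stand-in are assumptions made for this illustration.

```python
import time


def run_batch(names, action, delay_seconds=0.0):
    """Apply action to each name; return (succeeded, failed) without stopping on errors."""
    succeeded, failed = [], {}
    for name in names:
        try:
            action(name)
            succeeded.append(name)
        except Exception as exc:  # keep going, but remember why it failed
            failed[name] = str(exc)
        if delay_seconds:
            time.sleep(delay_seconds)
    return succeeded, failed


def fake_download(name):
    """Stand-in for a real download call; fails for one user on purpose."""
    if name == "user2":
        raise ValueError("profile not found")


ok, errors = run_batch(["user1", "user2", "user3"], fake_download)
```

In real use, action would wrap the profile lookup and download calls from the loop above, and a non-zero delay_seconds spaces out the requests.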

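4.6 Downloading Through a Proxy

Some network environments require a proxy for downloads. Instaloader makes its HTTP calls through the requests library, which reads proxy settings from the standard environment variables, so one way to route downloads through a proxy is to set the variable before creating the Instaloader object:

import os
from instaloader import Instaloader

# Set the proxy through the environment variable that requests honors
os.environ['HTTPS_PROXY'] = 'http://your_proxy_here'

# Create the Instaloader object after the proxy is set
loader_with_proxy = Instaloader()

# Download the user's images and videos
loader_with_proxy.download_profile(profile.username, profile_pic_only=False)

This code shows how to download data through a proxy with Instaloader.

The sections above extend the Instaloader coverage with user activity analysis, batch downloads, and proxy use.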

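5. SocialMediaMineR

5.1 Overview of the Social Media Data Mining Tool

SocialMediaMineR is presented here as a social media data mining tool that supports multiple platforms. The SocialMediaMiner class and the method names in this section are reproduced from the original examples; confirm them against the package you actually install. A brief introduction:

from SocialMediaMineR import SocialMediaMiner

# Create the SocialMediaMiner object
miner = SocialMediaMiner(api_key='Your_API_Key')

# Fetch tweets matching a keyword on Twitter
tweets = miner.search_tweets(query='data mining', count=5)

for tweet in tweets:
    print(f"{tweet['user']['screen_name']}: {tweet['text']}")

This code shows how to fetch tweets for a specific keyword on Twitter with SocialMediaMineR.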

5.2 Data Collection and Analysis

SocialMediaMineR provides data retrieval and analysis functions for user information and posts. An example:

# Fetch user information
user_info = miner.get_user_info(screen_name='twitter_handle')
print(f"User: {user_info['screen_name']}, Followers: {user_info['followers_count']}")

# Fetch the user's posts
user_posts = miner.get_user_posts(screen_name='twitter_handle', count=5)

for post in user_posts:
    print(f"{post['user']['screen_name']}: {post['text']}")

This code shows how to retrieve user information and posts with SocialMediaMineR.

5.3 Data Visualization and Use Cases

The social media data mined with SocialMediaMineR can be visualized to make it easier to interpret. A simple example:

import matplotlib.pyplot as plt

# Count tweet sources
source_counts = miner.count_tweet_sources(query='data mining', count=100)

plt.pie(source_counts.values(), labels=source_counts.keys(), autopct='%1.1f%%')
plt.title('Tweet Sources Distribution')
plt.show()

This code shows how to count tweet sources with SocialMediaMineR and visualize them as a pie chart.

5.4 Mining User Relationship Networks

# Fetch the user's followers and the accounts the user follows
user_followers = miner.get_user_followers(screen_name='twitter_handle', count=10)
user_following = miner.get_user_following(screen_name='twitter_handle', count=10)

print(f"Followers: {user_followers}")
print(f"Following: {user_following}")

5.5 Sentiment Analysis and Topic Identification

# Run sentiment analysis on tweets
sentiment_analysis = miner.sentiment_analysis(query='data mining', count=50)

positive_tweets = sum(1 for sentiment in sentiment_analysis if sentiment == 'positive')
negative_tweets = sum(1 for sentiment in sentiment_analysis if sentiment == 'negative')

print(f"Positive Tweets: {positive_tweets}, Negative Tweets: {negative_tweets}")

5.6 Scheduled Tasks and Automation

from apscheduler.schedulers.blocking import BlockingScheduler

# Create the scheduler
scheduler = BlockingScheduler()

# Define the job to run on schedule
def scheduled_job():
    tweets = miner.search_tweets(query='automation', count=5)
    for tweet in tweets:
        print(f"{tweet['user']['screen_name']}: {tweet['text']}")

# Run the job once a day
scheduler.add_job(scheduled_job, 'interval', days=1)

# Start the scheduler (blocks until interrupted)
scheduler.start()

The sections above extend the SocialMediaMineR coverage with user relationship mining, sentiment analysis, and scheduled, automated tasks.

6. PRAW (Python Reddit API Wrapper)

6.1 Connecting to and Using the Reddit API

PRAW is a Python package for accessing the Reddit API; it supports retrieving posts, comments, and related information. The basic way to connect to and use the Reddit API is as follows:

import praw

# Reddit API authentication
reddit = praw.Reddit(
    client_id='Your_Client_ID',
    client_secret='Your_Client_Secret',
    user_agent='Your_User_Agent'
)

# Fetch the hot posts of a specific subreddit
subreddit = reddit.subreddit('python')
hot_posts = subreddit.hot(limit=5)

for post in hot_posts:
    print(f"Title: {post.title}, Upvotes: {post.ups}")

This code shows how to authenticate against the Reddit API with PRAW and fetch the hot posts of a specific subreddit.

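6.2 Extracting Post and Comment Data

PRAW can extract data such as posts and comments. An example:

# Fetch a post
post = reddit.submission(id='post_id')
print(f"Title: {post.title}, Comments: {post.num_comments}")

# Fetch the post's comments; drop unloaded "more comments" placeholders first
post.comments.replace_more(limit=0)
comments = post.comments.list()

for comment in comments:
    # author is None when the account has been deleted
    author = comment.author.name if comment.author else '[deleted]'
    print(f"{author}: {comment.body}")

This code shows how to retrieve post information and comment data with PRAW.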

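6.3 Analyzing and Visualizing Reddit Data

Reddit data retrieved with PRAW can be analyzed and visualized with other libraries. A simple example that classifies recent posts by type:

import matplotlib.pyplot as plt

# Classify recent posts as text, image, or link
image_exts = ('.jpg', '.jpeg', '.png', '.gif')
post_types = ['Link', 'Text', 'Image']
type_counts = dict.fromkeys(post_types, 0)

for post in subreddit.new(limit=100):
    if post.is_self:
        type_counts['Text'] += 1
    elif post.url.lower().endswith(image_exts):
        type_counts['Image'] += 1
    else:
        type_counts['Link'] += 1

post_counts = [type_counts[t] for t in post_types]

plt.bar(post_types, post_counts, color=['red', 'green', 'blue'])
plt.title('Post Type Distribution')
plt.xlabel('Post Type')
plt.ylabel('Posts')
plt.show()

This code shows how to analyze the distribution of Reddit post types and visualize it with matplotlib.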

6.4 User Interaction Analysis

# Fetch the user's most recent posts and comments (up to 5 each)
user = reddit.redditor('username')
user_posts = list(user.submissions.new(limit=5))
user_comments = list(user.comments.new(limit=5))

print(f"User: {user.name}, Recent Posts: {len(user_posts)}, Recent Comments: {len(user_comments)}")

6.5 Exploring Trends Across Multiple Subreddits

# Subreddits to compare
subreddits = ['python', 'datascience', 'machinelearning']

# Fetch the newest posts of each subreddit
subreddit_posts = {name: list(reddit.subreddit(name).new(limit=10)) for name in subreddits}

for name, posts in subreddit_posts.items():
    print(f"{name} Posts:")
    for post in posts:
        print(f"  - {post.title}")

6.6 Reddit Bots and Automation

# Create a Reddit bot (an authenticated instance that can post)
reddit_bot = praw.Reddit(
    client_id='Bot_Client_ID',
    client_secret='Bot_Client_Secret',
    user_agent='Bot_User_Agent',
    username='Bot_Username',
    password='Bot_Password'
)

# Submit a post
subreddit = reddit_bot.subreddit('test')
subreddit.submit(title='Automated Post', selftext='This post was created by a bot.')

The sections above extend the PRAW coverage with user interaction analysis, trend exploration across multiple subreddits, and Reddit bots and automation.

7. Facepy

7.1 Using the Facebook Graph API

Facepy is a Python library for accessing the Facebook Graph API; it supports retrieving user information, posts, and other data. A simple example:

from facepy import GraphAPI

# Facebook Graph API authentication
access_token = 'Your_Access_Token'
graph = GraphAPI(access_token)

# Fetch user information
user_info = graph.get('me')
print(f"User: {user_info['name']}, ID: {user_info['id']}")

This code shows how to authenticate against the Facebook Graph API with Facepy and retrieve user information.

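7.2 Retrieving and Analyzing Data

Facepy supports data retrieval for analysis, such as fetching a user's posts. An example:

# Fetch the user's posts, requesting the fields used below
user_posts = graph.get('me/posts', limit=5, fields='from,message')

for post in user_posts['data']:
    # Posts without text have no 'message' key
    print(f"{post['from']['name']}: {post.get('message', '')}")

This code shows how to retrieve a user's post data with Facepy.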

7.3 User Interaction and Publishing

Facepy also supports user interactions and publishing content. An example that publishes a post and likes it:

# Publish a post
new_post = graph.post('me/feed', message='Hello, Facebook Graph API!')

# Like the post
graph.post(f"{new_post['id']}/likes")

This code shows the basic publish and like operations in Facepy.

7.4 Retrieving the User's Friend List

# Fetch the user's friend list (the Graph API returns only friends who also use the app)
friends = graph.get('me/friends')

for friend in friends['data']:
    print(f"Friend: {friend['name']}, ID: {friend['id']}")

7.5 Analyzing Post Interaction Data

# Fetch the like and comment counts of a post
post_id = 'post_id_here'
post_interactions = graph.get(post_id, fields='likes.summary(true),comments.summary(true)')

likes_count = post_interactions['likes']['summary']['total_count']
comments_count = post_interactions['comments']['summary']['total_count']

print(f"Likes: {likes_count}, Comments: {comments_count}")

7.6 Analyzing User Relationships with the Data

# Collect (friend, friend-of-friend) ID pairs
friends_and_friends_of_friends = []
for friend in friends['data']:
    friend_id = friend['id']
    friend_friends = graph.get(f'{friend_id}/friends')['data']
    friends_and_friends_of_friends.extend((friend_id, friend_friend['id']) for friend_friend in friend_friends)

print("Friends and Friends of Friends:")
for pair in friends_and_friends_of_friends:
    print(pair)
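A flat list of ID pairs becomes more useful once it is turned into a graph. As a minimal sketch, the code below builds an undirected adjacency map from such pairs and looks up the friends two users have in common. The helper names and the sample IDs are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_graph(pairs):
    """Build undirected adjacency sets from (user_id, friend_id) pairs."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def mutual_friends(graph, a, b):
    """Return the IDs connected to both a and b."""
    return graph.get(a, set()) & graph.get(b, set())


# Made-up IDs in the same (friend, friend-of-friend) shape as the list above
sample_pairs = [("f1", "x"), ("f1", "y"), ("f2", "y"), ("f2", "z")]
friend_graph = build_graph(sample_pairs)
shared = mutual_friends(friend_graph, "f1", "f2")
```

For anything beyond simple lookups, such as centrality or community detection, the same pairs can be loaded into a dedicated graph library.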

The sections above extend the Facepy coverage with friend list retrieval, post interaction analysis, and the use of this data to analyze user relationships.

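8. Tweepy Streaming

8.1 Retrieving and Processing Streaming Data

The streaming support used here lives in Tweepy's tweepy.streaming module and handles tweets as they are published. The examples use the Tweepy 3.x interface and the authenticated api object from section 1.1. A simple example:

from tweepy.streaming import StreamListener
from tweepy import Stream

class MyStreamListener(StreamListener):
    def on_status(self, status):
        # Called once for every tweet delivered by the stream
        print(f"{status.user.screen_name}: {status.text}")

# Create the Stream object and start the real-time stream
my_stream_listener = MyStreamListener()
my_stream = Stream(auth=api.auth, listener=my_stream_listener)

# Keep only tweets containing the keyword 'python'
my_stream.filter(track=['python'])

This code shows how to use Tweepy's streaming module to process tweets containing the keyword "python" as they are published.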

8.2 Real-Time Social Media Data Analysis

Combining a real-time stream with analysis code enables real-time social media analysis. A simple example:

from collections import Counter

# Keywords to track (example values) and their running counts
keywords = ['data', 'analysis', 'python']
keyword_counter = Counter()

class MyStreamListener(StreamListener):
    def on_status(self, status):
        text = status.text.lower()
        for keyword in keywords:
            if keyword in text:
                keyword_counter[keyword] += 1

        print(f"Real-time Keyword Frequency: {keyword_counter}")

# Create the Stream object and start the real-time stream
my_stream_listener = MyStreamListener()
my_stream = Stream(auth=api.auth, listener=my_stream_listener)

# Keep only tweets containing the tracked keywords
my_stream.filter(track=keywords)

This code shows how to use a real-time stream to count how often each keyword appears in incoming tweets.
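A plain substring test also counts "data" inside words such as "database". The sketch below shows one possible refinement, matching whole words only, and runs offline on a few hand-written strings instead of a live stream. The count_keywords helper is an illustrative name, not part of Tweepy.

```python
import re
from collections import Counter


def count_keywords(texts, keywords):
    """Count, per keyword, how many texts contain it as a whole word (case-insensitive)."""
    patterns = {
        kw: re.compile(r"\b" + re.escape(kw) + r"\b", re.IGNORECASE)
        for kw in keywords
    }
    counts = Counter()
    for text in texts:
        for kw, pattern in patterns.items():
            if pattern.search(text):
                counts[kw] += 1
    return counts


sample_texts = [
    "Python data analysis",
    "Update your database",
    "PYTHON rocks",
]
frequency = count_keywords(sample_texts, ["data", "analysis", "python"])
```

Inside a stream handler, the same patterns could be compiled once up front and applied to each incoming tweet's text.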

8.3 Real-Time Sentiment Analysis

from textblob import TextBlob

# Run sentiment analysis on each incoming tweet
class MyStreamListener(StreamListener):
    def on_status(self, status):
        polarity = TextBlob(status.text).sentiment.polarity
        if polarity > 0:
            sentiment = 'Positive'
        elif polarity < 0:
            sentiment = 'Negative'
        else:
            sentiment = 'Neutral'
        print(f"{status.user.screen_name}: {status.text}, Sentiment: {sentiment}")

# Create the Stream object and start the real-time stream
my_stream_listener = MyStreamListener()
my_stream = Stream(auth=api.auth, listener=my_stream_listener)

# Filter the real-time tweets
my_stream.filter(track=['data science', 'machine learning'])

8.4 Real-Time Data Storage

import json

# Append each incoming tweet to a file, one JSON object per line
class MyStreamListener(StreamListener):
    def on_status(self, status):
        tweet_data = {
            'user': status.user.screen_name,
            'text': status.text,
            'created_at': str(status.created_at)
        }
        with open('real_time_tweets.json', 'a') as f:
            f.write(json.dumps(tweet_data) + '\n')

# Create the Stream object and start the real-time stream
my_stream_listener = MyStreamListener()
my_stream = Stream(auth=api.auth, listener=my_stream_listener)

# Filter the real-time tweets
my_stream.filter(track=['python', 'programming'])
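Because each tweet is written as one JSON object per line, the resulting file is not a single JSON document and cannot be read with one json.load call. The following sketch shows the matching read side, using a temporary file so it runs without a live stream. The helper names and the sample records are made up for this illustration.

```python
import json
import os
import tempfile


def append_record(path, record):
    """Append one record as a single JSON line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_records(path):
    """Read a JSON Lines file back into a list of dicts, skipping blank lines."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


# Round trip through a temporary file
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "tweets.jsonl")
    append_record(demo_path, {"user": "alice", "text": "hello"})
    append_record(demo_path, {"user": "bob", "text": "こんにちは"})
    loaded = read_records(demo_path)
```

Opening the file with an explicit UTF-8 encoding keeps non-ASCII tweet text readable regardless of the platform default.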

The sections above extend the Tweepy streaming coverage with real-time sentiment analysis and real-time data storage.

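Summary

This article has introduced a range of social media analysis tools and gives readers a foundation for further study. With these tools, readers can retrieve social media data, analyze user behavior, process real-time data streams, and present their findings with data visualization. This is of practical value to professionals in marketing, public opinion analysis, and social trend research, as well as to learners interested in social media data mining.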