Twitter scraper
Scrape a user's tweets :D
Usage:
Unauthenticated
Example:
scraper = TweetScraper()
tweets = scraper.get_tweets_anonymous("<user_id>")
This only allows use of the anonymous tweets method; other methods will fail.
The anonymous method returns a list of the user's tweets as seen from a logged-out session. It returns only 100 tweets, which are not necessarily the most recent.
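Because the anonymous result is not guaranteed to be the most recent tweets, or in chronological order, it can help to sort it yourself. The sketch below assumes each tweet exposes a `date` attribute holding a timezone-aware `datetime` (the field list only says "post date", so the exact type is an assumption). `FakeTweet` and `newest_first` are hypothetical names used for illustration and are not part of the scraper.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class FakeTweet:
    """Hypothetical stand-in exposing only the fields used here."""
    text: str
    date: datetime  # assumed to be a timezone-aware datetime


def newest_first(tweets):
    """Return a new list of tweets sorted by post date, newest first."""
    return sorted(tweets, key=lambda t: t.date, reverse=True)


sample = [
    FakeTweet("older", datetime(2024, 5, 1, tzinfo=timezone.utc)),
    FakeTweet("newest", datetime(2025, 1, 20, tzinfo=timezone.utc)),
    FakeTweet("oldest", datetime(2023, 3, 9, tzinfo=timezone.utc)),
]
ordered = newest_first(sample)
print([t.text for t in ordered])
```

With the real scraper, the same helper could be applied to the list returned by get_tweets_anonymous(), provided the assumption about `date` holds.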
Authenticated
Example:
import os
import dotenv

dotenv.load_dotenv()
auth_token = os.environ["AUTH_TOKEN"]
csrf_token = os.environ["CSRF_TOKEN"]
scraper = TweetsScraper(auth_token, csrf_token)
user_id = scraper.get_id_from_handle("pobnellion")
user_tweets = scraper.get_tweets(user_id, 100)
Lets you fetch tweets as a logged-in user. Twitter only makes roughly the 2000 most recent tweets available, but that should be more than enough.
You can either pass the user id directly to get_tweets(), or use get_id_from_handle() to look up the id if you don't have it.
To use dotenv, include a .env file in the directory with the following contents (no quotes around the values):
AUTH_TOKEN=<auth token>
CSRF_TOKEN=<csrf token>
You can find your auth and CSRF tokens in Twitter's cookies (F12 in your browser > Storage tab > Cookies).
The auth token cookie is called auth_token and the CSRF token cookie is called ct0.
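A missing or empty token otherwise surfaces as a bare KeyError or a failed request, so a small check up front can give a clearer message. This is only a minimal sketch: `load_tokens` is a hypothetical helper, not part of the scraper, and it assumes nothing beyond the two variable names shown above.

```python
import os


def load_tokens(env=None):
    """Read AUTH_TOKEN and CSRF_TOKEN from a mapping (default: os.environ).

    Raises RuntimeError naming every missing or empty variable.
    """
    env = os.environ if env is None else env
    names = ("AUTH_TOKEN", "CSRF_TOKEN")
    missing = [name for name in names if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["AUTH_TOKEN"].strip(), env["CSRF_TOKEN"].strip()


# Example with an explicit mapping instead of the real environment.
auth_token, csrf_token = load_tokens({"AUTH_TOKEN": "abc", "CSRF_TOKEN": "def"})
```

In real use you would call dotenv.load_dotenv() first, then load_tokens() with no arguments, and pass the two values to the scraper constructor.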
Include replies
user_id = scraper.get_id_from_handle("@pobnellion")
user_tweets = scraper.get_tweets_and_replies(user_id, 100)
This is equivalent to viewing the 'Replies' tab on Twitter. Replies show up as Conversation objects, each containing a list of tweets. The last tweet in a conversation is always by the currently viewed user, even if there are more replies in the chain.
Tweet object
Contains the text of the tweet, along with the timestamp and some stats (likes, retweets, views, etc.).
Fields:
- id : tweet id
- views : view count
- text : tweet content
- likes : like count
- replies : reply count
- retweets : retweet count
- quotes : quote tweet count
- date : post date
- is_retweet : tweet is a retweet
- is_quote : tweet is a quote tweet
- user : user who sent the tweet (useful in conversations)
Printing a tweet object results in an overview:
L:52 RT:2 Q:1 R:3 V:1032 2025-01-20T01:53:57+00:00 Example tweet text
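The overview line appears to read as likes, retweets, quotes, replies, views, an ISO 8601 timestamp, then the text. As an illustration of that layout only (not the library's actual implementation), the sketch below rebuilds the same string from the documented fields. `TweetSketch` is a hypothetical stand-in, and `date` is assumed to be a timezone-aware `datetime`.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class TweetSketch:
    """Hypothetical stand-in with the fields needed for the overview line."""
    likes: int
    retweets: int
    quotes: int
    replies: int
    views: int
    date: datetime  # assumed to be a timezone-aware datetime
    text: str

    def __str__(self):
        # Stats first, then the ISO 8601 timestamp, then the tweet text.
        return (
            f"L:{self.likes} RT:{self.retweets} Q:{self.quotes} "
            f"R:{self.replies} V:{self.views} "
            f"{self.date.isoformat()} {self.text}"
        )


tweet = TweetSketch(
    likes=52, retweets=2, quotes=1, replies=3, views=1032,
    date=datetime(2025, 1, 20, 1, 53, 57, tzinfo=timezone.utc),
    text="Example tweet text",
)
overview = str(tweet)
print(overview)
```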
Conversation object
Container for a list of tweets as shown when viewing the replies tab. It carries no other information.
Fields:
- items : list of tweets in the conversation
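Since the last tweet in each conversation is by the viewed user, taking the final element of `items` is one way to collect just that user's replies. The sketch below uses hypothetical stand-in classes (`TweetSketch`, `ConversationSketch`) rather than the scraper's real objects, and represents the user as a plain handle string for brevity; `own_replies` is likewise an illustrative name.

```python
from dataclasses import dataclass, field


@dataclass
class TweetSketch:
    """Hypothetical stand-in for a tweet."""
    user: str  # simplified to a handle string for this sketch
    text: str


@dataclass
class ConversationSketch:
    """Hypothetical stand-in holding only the documented `items` field."""
    items: list = field(default_factory=list)


def own_replies(conversations):
    """Return the last tweet of every non-empty conversation."""
    return [c.items[-1] for c in conversations if c.items]


convs = [
    ConversationSketch([
        TweetSketch("alice", "question?"),
        TweetSketch("pobnellion", "answer"),
    ]),
    ConversationSketch([TweetSketch("pobnellion", "standalone reply")]),
    ConversationSketch([]),  # skipped: nothing to pick from
]
replies = own_replies(convs)
print([t.text for t in replies])
```

The earlier tweets in each conversation remain available through `items` when the surrounding context is needed.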
User object
Represents a Twitter user.
Fields:
- id : user id
- handle : user handle (without @)
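- display_name : display name shown on the profile
- description : profile description (bio)
- join_date : date the account was created
- location : location shown on the profile
- tweets_count : number of tweets posted
- blue_verified : whether the account is Blue verified
- follower_count : number of followers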