This project uses two DynamoDB tables and an S3 bucket:
- YouTubeChannelMonitor: Stores information about the YouTube channels being monitored.
  - Partition key: `channel_name` (String)
  - Sort key: `channel_id` (String)
- VideoProcessingHistory: Keeps track of processed videos to avoid duplicate processing.
  - Partition key: `video_url` (String)
  - Sort key: `processed_at` (String)
- S3 Bucket: Stores the generated videos.
  - Bucket name: `recall-bot-ig-reel`

To configure the S3 bucket for video uploads, modify the `bucket_name` parameter in the `process_and_generate_video` function call in `pipeline.py`.
Lifecycle rule: Set up a lifecycle rule to delete objects after a set retention period to manage storage costs.
To set up these tables in your AWS account, follow these steps:
- Log in to the AWS Management Console.
- Navigate to the DynamoDB service.
- Click on "Create table" and create each of the following tables:
  - Table name: `YouTubeChannelMonitor`
    - Partition key: `channel_name` (String)
    - Sort key: `channel_id` (String)
  - Table name: `VideoProcessingHistory`
    - Partition key: `video_url` (String)
    - Sort key: `processed_at` (String)
- Leave the other settings at their defaults and click "Create".
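If you prefer scripting the setup, the same tables can be created with boto3 instead of the console. The key schemas below come from this README; the helper name and the `PAY_PER_REQUEST` billing mode are assumptions, so adjust them to your account's needs:

```python
# Sketch: creating the two tables programmatically (requires boto3 and AWS credentials).
TABLES = {
    "YouTubeChannelMonitor": [("channel_name", "HASH"), ("channel_id", "RANGE")],
    "VideoProcessingHistory": [("video_url", "HASH"), ("processed_at", "RANGE")],
}

def create_table_params(name):
    """Build the keyword arguments for DynamoDB's create_table call."""
    keys = TABLES[name]
    return {
        "TableName": name,
        "KeySchema": [{"AttributeName": a, "KeyType": t} for a, t in keys],
        # All keys in this project are strings
        "AttributeDefinitions": [{"AttributeName": a, "AttributeType": "S"} for a, _ in keys],
        "BillingMode": "PAY_PER_REQUEST",  # assumption; on-demand avoids capacity planning
    }

def create_tables():
    import boto3  # imported here so the helpers above are usable without AWS installed
    client = boto3.client("dynamodb")
    for name in TABLES:
        client.create_table(**create_table_params(name))
```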
Go to AWS DynamoDB Console, find YouTubeChannelMonitor table, click on Items, and add, remove, or list YouTube channels.
The VideoProcessingHistory table is automatically managed by the `pipeline.py` script. It adds entries when videos are processed and checks this table to avoid reprocessing videos.
Ensure your AWS credentials are properly configured to allow the script to interact with these DynamoDB tables.
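The duplicate check can be sketched as follows. These helper names are hypothetical (the real logic lives in `pipeline.py`); the table and key names match the schema described above:

```python
import datetime

TABLE = "VideoProcessingHistory"

def already_processed(dynamodb_client, video_url):
    """Return True if this URL already has an entry in VideoProcessingHistory."""
    resp = dynamodb_client.query(
        TableName=TABLE,
        KeyConditionExpression="video_url = :v",
        ExpressionAttributeValues={":v": {"S": video_url}},
    )
    return resp["Count"] > 0

def mark_processed(dynamodb_client, video_url):
    """Record a video as processed, keyed by URL plus a timestamp sort key."""
    dynamodb_client.put_item(
        TableName=TABLE,
        Item={
            "video_url": {"S": video_url},
            "processed_at": {"S": datetime.datetime.utcnow().isoformat()},
        },
    )
```

With a `boto3.client("dynamodb")` instance, the pipeline would call `already_processed` before generating a reel and `mark_processed` afterwards.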
Before running the bot, make sure to configure your AWS credentials. Follow these steps:
- Install the AWS CLI if you haven't already:

  ```
  sudo apt install awscli
  ```

- Run the following command and enter your AWS credentials when prompted:

  ```
  aws configure
  ```
You'll need to provide:
- AWS Access Key ID
- AWS Secret Access Key
- Default region name (e.g., us-west-2)
- Default output format (you can press Enter to use the default)
- Ensure that the AWS user associated with these credentials has the necessary permissions to access DynamoDB and S3.
This configuration allows the bot to interact with AWS services, including DynamoDB for storing channel and video processing information, and S3 for uploading generated videos.
Guide to get Instagram Access Token (https://www.youtube.com/watch?v=D-bF4aVByF4&t=1s)
- Convert your personal Instagram account to a professional account.
- Get your Facebook Business Page ID and save it.
- Link the Facebook Page to the Instagram account.
- Generate a Facebook access token using the Instagram API:
  - Permissions: all Instagram permissions, `pages_manage_posts`, `pages_read_engagement`
  - Select "Get token".
The access token is valid for 1 hour. To get an extended (long-lived) access token, check the tutorial video.
- Get the Instagram account ID:

  ```
  {facebook_id}?fields=instagram_business_account&access_token={access_token}
  ```
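That endpoint can be called from Python with only the standard library. This is a sketch: the Graph API version in the base URL is an assumption (use whatever version your app targets), and the function names are hypothetical:

```python
import json
import urllib.request

GRAPH_BASE = "https://graph.facebook.com/v19.0"  # version is an assumption

def account_id_url(facebook_id, access_token):
    """Build the Graph API URL from the pattern above."""
    return (f"{GRAPH_BASE}/{facebook_id}"
            f"?fields=instagram_business_account&access_token={access_token}")

def fetch_instagram_account_id(facebook_id, access_token):
    """Call the Graph API and return the linked IG business account ID."""
    with urllib.request.urlopen(account_id_url(facebook_id, access_token)) as resp:
        data = json.load(resp)
    return data["instagram_business_account"]["id"]
```

Save the returned ID as `INSTAGRAM_ACCOUNT_ID`; it is needed for the reel upload step.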
Before running the bot, make sure to set up the following folders:
- Create an `inputs` folder in the root directory:
  - This folder should contain MP4 files to be used as background videos for the generated reels.
  - You can download free stock videos from websites like Pexels.
- Create an `outputs` folder in the root directory:
  - This folder will store the generated video reels.
Ensure both folders exist and that the `inputs` folder contains at least one MP4 file before running the bot.
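A small pre-flight check like the following (a hypothetical helper, not part of the repo) can catch a missing folder before the pipeline starts:

```python
from pathlib import Path

def check_folders(root="."):
    """Verify inputs/ holds at least one MP4 and ensure outputs/ exists."""
    root = Path(root)
    inputs, outputs = root / "inputs", root / "outputs"
    outputs.mkdir(exist_ok=True)  # outputs/ is safe to create on the fly
    if not inputs.is_dir():
        raise FileNotFoundError("inputs/ folder is missing")
    mp4s = sorted(inputs.glob("*.mp4"))
    if not mp4s:
        raise FileNotFoundError("inputs/ must contain at least one MP4 background video")
    return mp4s
```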
To use the recall-ai-bot, follow these steps:
- Setup:
  - Install the required dependencies:

    ```
    pip install -r requirements.txt
    ```

  - Set up your environment variables in a `.env` file. Include the following:
    - `RECALL_API_SECRET`: Your Recall API key
    - `OPENAI_API_KEY`: Your OpenAI API key
    - `INSTAGRAM_ACCESS_TOKEN`: Your Instagram access token
    - `INSTAGRAM_ACCOUNT_ID`: Your Instagram account ID
- Configure YouTube Channels:
  - Edit the `monitor/monitor_list.txt` file to include the YouTube channels you want to monitor. Each line should contain the channel name and ID, separated by a comma:

    ```
    Channel: Bloomberg Technology, UCrM7B7SL_g1edFOnmj-SDKg
    Channel: Andrew Huberman, UC2D2CMWXMOVWx7giW1n3LIg
    ```
- Run the Bot:
  - Execute the main script:

    ```
    python pipeline.py
    ```

  - The bot will perform the following actions:
    a. Check for new videos from the specified channels
    b. Process new videos to generate summaries
    c. Create Instagram Reel videos from the summaries
    d. Upload the videos to an AWS S3 bucket
    e. Post the videos as Instagram Reels
- Output:
  - Generated videos will be saved in the `outputs` folder as `reel_output_p{part_number}.mp4`
  - Summaries and other data will be saved in the `operation_data` folder
- Customization:
  - You can adjust video generation settings in the `autoeditor/generator.py` file (lines 8-95).
- Monitoring:
  - To change the time frame for checking new videos, modify the `hours_ago` parameter in the `get_latest_videos_rss` function (`monitor/search.py`, lines 7-30).
- Instagram Upload:
  - The bot automatically uploads generated videos to Instagram using the `reel_upload.py` script.
Note: Ensure you have the necessary permissions and comply with YouTube's and Instagram's terms of service when using this bot.
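The `monitor/monitor_list.txt` format shown above could be parsed with a helper like this (a sketch; the actual parsing lives in the `monitor` module):

```python
def parse_monitor_list(text):
    """Parse lines of the form 'Channel: <name>, <channel_id>'."""
    channels = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("Channel:"):
            continue  # skip blank or unrecognized lines
        body = line[len("Channel:"):]
        # Split on the LAST comma, since channel names may contain commas
        name, _, channel_id = body.rpartition(",")
        channels.append((name.strip(), channel_id.strip()))
    return channels
```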
The recall-ai-bot operates through the following process:
- YouTube Channel Monitoring:
  - The bot starts by executing the `pipeline.py` code.
  - It uses the `monitor` module to check if any YouTube channels listed in `monitor_list.txt` have uploaded new videos within the last 24 hours.
  - The module returns a list of YouTube URLs for new videos.
- Video Processing:
  - The bot iterates through the list of URLs.
  - For each URL, it uses the `recall_api` module to generate:
    a. A summary script
    b. An Instagram Reel caption
  - Both are returned in JSON format.
- Video Generation:
  - The `autoeditor` module is used to convert the script into one or more Instagram Reel videos.
  - If a script is too long for a single Reel, it's automatically split into multiple parts.
- Upload and Posting:
  - Generated videos are uploaded to an AWS S3 bucket.
  - The S3 URLs are then used to post the videos as Instagram Reels.
This automated process allows for efficient creation and distribution of content, transforming YouTube videos into engaging Instagram Reels with minimal manual intervention.
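The four stages above can be sketched as a single loop. The stage functions here are stand-ins for the `monitor`, `recall_api`, `autoeditor`, and upload modules, passed in as parameters rather than taken from the real codebase:

```python
def run_pipeline(get_new_videos, summarize, make_reels, upload_and_post):
    """One pass of the monitoring -> summarizing -> generating -> posting loop."""
    posted = []
    for url in get_new_videos():                  # 1. YouTube channel monitoring
        summary = summarize(url)                  # 2. summary script + caption (JSON)
        for reel in make_reels(summary):          # 3. one or more reel videos per script
            posted.append(upload_and_post(reel))  # 4. S3 upload + Instagram post
    return posted
```

Injecting the stages as functions keeps the orchestration testable without touching AWS or Instagram.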
- `pipeline.py`: The main orchestrator of the entire process. This script coordinates the entire workflow, from monitoring YouTube channels to generating and processing videos.
- `monitor/search.py`: Handles YouTube channel monitoring and video retrieval. This module checks for new videos from specified YouTube channels and retrieves the most viewed recent videos.
- `recall_api/workflow.py`: Processes video URLs to generate structured summaries. This component interacts with the Recall API to fetch and process video data, generating enhanced summaries.
- `autoeditor/generator.py`: Converts scripts into video content. This module handles the generation of video content from the processed scripts, including audio synthesis and video editing.
- `autoeditor/editor.py`: Manages video editing and composition. This class is responsible for combining various elements (background video, subtitles, cover images) into the final video output.
```
max_videos = 10  # Maximum number of videos to process in one run
```

This setting limits the number of videos that will be processed in a single execution of the script. You can adjust this value in the `pipeline.py` file if you want to process more or fewer videos per run.
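Applying such a cap is a one-line slice; this is a hypothetical sketch (check `pipeline.py` for the actual logic):

```python
max_videos = 10  # maximum number of videos to process in one run

def cap_video_list(video_urls, limit=max_videos):
    """Keep only the first `limit` URLs for this run; the rest wait for the next run."""
    return video_urls[:limit]
```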
Complete the setup above, then try running the script on the server.
Schedule the script to run on the AWS EC2 server every 24 hours using cron:

```
crontab -e
```

Add this line:

```
0 0 * * * /usr/bin/python3 /home/ubuntu/recall-ai-bot/pipeline.py >> /home/ubuntu/recall-ai-bot/pipeline.log 2>&1
```

The script will then run every 24 hours, at midnight in the server's local time.
Note: In our experimentation, running the bot from AWS carries a risk of the Instagram page being suspended.
- To change the output paths of the auto-generated video, audio, and SRT files, see `autoeditor/generator.py`, lines 69 and 74.
- Possible improvement: stop generating the intermediate audio pieces in the `outputs` folder.
- A strange temporary file (`reel_output_p1TEMP_MPY_wvf_snd`) is generated in the root folder and is automatically deleted.
- The JSON file input is controlled in `workflow.py`.
- See also `autoeditor/generator.py` line 50, and `pipeline.py` line 25 (`char_limit` and `upper_limit`).
To ensure that only substantial content is processed into videos, a minimum character count check has been implemented. Videos with summaries shorter than the specified character count will be skipped. You can adjust this setting in the `pipeline.py` file:

```
# Set the minimum character count for video generation
min_char_count = 800  # You can adjust this value as needed
```
This feature helps to filter out videos that might not have enough content for a meaningful summary or Reel.
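The check itself amounts to a simple length comparison; the helper name below is hypothetical:

```python
min_char_count = 800  # summaries shorter than this are skipped

def should_generate_video(summary_text, minimum=min_char_count):
    """Only generate a reel when the summary has enough substance."""
    return len(summary_text) >= minimum
```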
To change the summarization behavior, edit the prompt in `recall_api/gpt_summary.py`.
The Instagram Reel upload process is handled by the `reel_upload.py` script. To adjust the upload settings or modify the caption format, edit this file.
Videos are now stored in S3 with unique identifiers based on the YouTube video ID. This prevents overwriting when uploading multiple series of video summaries. The naming convention is `videos/{youtube_video_id}_p{part_number}.mp4`.
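That naming convention can be expressed as a small helper (hypothetical name; the real key construction is in the upload code):

```python
def s3_video_key(youtube_video_id, part_number):
    """Unique S3 object key per source video and reel part."""
    return f"videos/{youtube_video_id}_p{part_number}.mp4"
```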
To mimic human behavior and reduce the risk of being identified as a bot, the script implements a random delay between video uploads. This delay is set between 10 seconds and 10 minutes. You can adjust this range in the `pipeline.py` file:

```
delay = random.uniform(10, 600)  # Random delay between 10 s and 10 minutes
```
This feature helps to make the upload pattern less predictable and more human-like.
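In context, the delay would be used roughly like this. The function is a sketch (the real code sits inline in `pipeline.py`); the injectable `sleep` parameter is an addition that makes the behavior testable:

```python
import random
import time

def human_like_delay(low=10, high=600, sleep=time.sleep):
    """Pause a random 10 s - 10 min between uploads; returns the chosen delay."""
    delay = random.uniform(low, high)
    sleep(delay)
    return delay
```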