We’ve now prepared our visual and audio elements. It’s time to combine them into a cohesive video. For this task, we’ll use MoviePy, a Python library that provides a user-friendly interface for editing video, audio, and images. In this blog post, we’ll explore how to use MoviePy to combine images and audio into a simple video.
Setting Up the Project:
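Before we dive into the code, make sure you have MoviePy installed in your Python environment. The code in this post uses the MoviePy 1.x API (moviepy.editor, set_duration, set_audio). MoviePy 2.x removed the moviepy.editor module and renamed these methods to with_duration and with_audio, so pin a 1.x release to run the code unchanged. You can install it using pip:
pip install "moviepy<2"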
Importing Necessary Modules:
from moviepy.editor import ImageClip, concatenate_videoclips, AudioFileClip
This line imports the two classes and one function we need from the MoviePy library:
- ImageClip: Used to create video clips from images.
- concatenate_videoclips: Used to join multiple video clips together.
- AudioFileClip: Used to load and manipulate audio files.

Setting Paths:
images = ["./video_images/vid_image1.png", "./video_images/vid_image6.png", "./video_images/vid_image7.png"]
audio_path = "./voiceover/speech1.mp3"
output_video_path = "output_video.mp4"
This code defines the paths to the images, audio file, and the output video file. You can customize these paths to match your file structure.
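Because a mistyped path only surfaces later, when MoviePy tries to open the file, it can help to check the inputs up front. The following is a minimal sketch using only the standard library; find_missing is a hypothetical helper name, not part of MoviePy.

```python
from pathlib import Path


def find_missing(paths):
    """Return the entries of `paths` that do not point to an existing file."""
    return [p for p in paths if not Path(p).is_file()]


# Same example paths as in this post; adjust them to your own file structure.
images = [
    "./video_images/vid_image1.png",
    "./video_images/vid_image6.png",
    "./video_images/vid_image7.png",
]
audio_path = "./voiceover/speech1.mp3"

missing = find_missing(images + [audio_path])
if missing:
    print("Missing input files:", ", ".join(missing))
```

If the list comes back empty, every input exists and the remaining steps can proceed.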
Loading Audio File:
audio = AudioFileClip(audio_path)
audio_duration = audio.duration  # Duration of the audio in seconds
- This code loads the audio file specified in audio_path using the AudioFileClip class.
- The audio_duration variable stores the duration of the audio in seconds.
Calculating Image Duration:
image_duration = audio_duration / len(images)
This code calculates the duration for each image by dividing the total audio duration by the number of images. Every image is shown for the same length of time, and together the images span exactly the length of the audio.
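To make the arithmetic concrete, here is a small worked sketch. The 12-second audio length is a made-up example value, and per_image_duration is a hypothetical helper that also guards against an empty image list, which would otherwise cause a division by zero.

```python
def per_image_duration(audio_duration, image_count):
    """Split the audio length evenly across the images."""
    if image_count <= 0:
        raise ValueError("at least one image is required")
    return audio_duration / image_count


example_audio_duration = 12.0  # hypothetical audio length in seconds
image_count = 3

image_duration = per_image_duration(example_audio_duration, image_count)
print(image_duration)  # 4.0

# Time at which each image first appears once the clips are joined end to end.
start_times = [i * image_duration for i in range(image_count)]
print(start_times)  # [0.0, 4.0, 8.0]
```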
Creating Image Clips:
clips = [ImageClip(img).set_duration(image_duration) for img in images]
- This line creates a list of ImageClip objects, one for each image in the images list.
- The set_duration method is used to set the duration of each image clip to the calculated image_duration.
- The list comprehension creates all the clips in a single expression, which keeps the code concise.
 

Combining Clips:
video_clip = concatenate_videoclips(clips, method="compose")
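- This line combines all the image clips created in the previous step into a single video clip using the concatenate_videoclips function from MoviePy. The clips play one after another, in list order.
- The method="compose" argument controls how clips of different sizes are handled. With “compose”, no clip is resized: the final video is as wide as the widest clip and as tall as the tallest clip, and smaller clips appear centered. The default method, “chain”, simply plays the frames of each clip in turn without correcting for size differences, so it is best suited to images that all share the same resolution.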
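As an illustration of how method="compose" sizes the output when the images differ in resolution, the sketch below computes the canvas size (widest width, tallest height) and the offset that centers each clip. It is a plain-Python approximation of the documented behavior, not MoviePy's own code; compose_layout and the example sizes are hypothetical, and MoviePy's exact pixel rounding may differ.

```python
def compose_layout(sizes):
    """Given (width, height) pairs, return the canvas size and the
    top-left offset that centers each clip on that canvas."""
    canvas_w = max(w for w, _ in sizes)
    canvas_h = max(h for _, h in sizes)
    offsets = [((canvas_w - w) // 2, (canvas_h - h) // 2) for w, h in sizes]
    return (canvas_w, canvas_h), offsets


# Hypothetical image sizes: one full HD, one smaller, one square.
canvas, offsets = compose_layout([(1920, 1080), (1280, 720), (1080, 1080)])
print(canvas)   # (1920, 1080)
print(offsets)  # [(0, 0), (320, 180), (420, 0)]
```

If all your images already have the same resolution, every offset is (0, 0) and the two methods produce the same picture.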
Setting Audio & FPS:
video_clip = video_clip.set_audio(audio)
fps = 24
- The first line adds the previously loaded audio to the video clip. The set_audio method returns a copy of the clip with the audio attached, which is why the result is assigned back to video_clip.
- The second line sets the frames per second (FPS) for the output video. A higher FPS generally makes motion smoother, but for a slideshow of still images it mainly increases the number of frames that have to be encoded. You can adjust the fps value to your desired frame rate.
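To get a feel for what the frame rate means for a slideshow, the sketch below estimates how many frames the encoder has to write at a few common rates. The 12-second duration is a made-up example, frame_count is a hypothetical helper, and the number MoviePy actually writes may differ by a frame.

```python
def frame_count(duration_seconds, fps):
    """Approximate number of frames written for a clip of this length."""
    return round(duration_seconds * fps)


example_duration = 12.0  # hypothetical video length in seconds
for candidate_fps in (12, 24, 60):
    print(candidate_fps, frame_count(example_duration, candidate_fps))
# 12 144
# 24 288
# 60 720
```

Since each still image is repeated for every frame in its time slot, a higher rate does not change what the viewer sees, only the amount of encoding work.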
Exporting the Video:
video_clip.write_videofile(output_video_path, codec="libx264", audio_codec="aac", fps=fps)
- This line exports the final video to the specified output_video_path using the write_videofile method.
- The codec arguments specify the video and audio codecs to use. In this case, "libx264" is a popular video codec known for its quality and efficiency, while "aac" is a widely used audio codec.
- The fps argument sets the frame rate of the output video, matching the previously specified value.
Following these steps merges the image clips and the audio into a single video, with the images spread evenly across the length of the voiceover so that visuals and sound start and end together.
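For convenience, the steps above can be gathered into one function. This is a sketch under a few assumptions: it targets the MoviePy 1.x API used throughout this post, build_slideshow is a hypothetical name, and the empty-list check and the final close() call are additions that the step-by-step code does not include.

```python
def build_slideshow(image_paths, audio_path, output_path, fps=24):
    """Combine still images and one audio track into a video (MoviePy 1.x API)."""
    if not image_paths:
        raise ValueError("at least one image is required")

    # Imported here so the function can be defined even if MoviePy is missing.
    from moviepy.editor import AudioFileClip, ImageClip, concatenate_videoclips

    audio = AudioFileClip(audio_path)
    try:
        # Show every image for an equal share of the audio length.
        image_duration = audio.duration / len(image_paths)
        clips = [ImageClip(p).set_duration(image_duration) for p in image_paths]

        # Join the clips end to end and attach the audio track.
        video = concatenate_videoclips(clips, method="compose").set_audio(audio)
        video.write_videofile(output_path, codec="libx264", audio_codec="aac", fps=fps)
    finally:
        audio.close()  # release the file reader behind the audio clip


# Example call, using the paths defined earlier in this post:
# build_slideshow(images, audio_path, output_video_path, fps=24)
```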
