pip install streamlit openai
streamlit run trans.py
This GPT processes SRT or timecoded text files, typically exported from YouTube videos, and turns them into an Edit Decision List (EDL) with summarized highlight ideas. It scans the timestamps, summarizes each segment, and structures the result as an EDL suitable for post-production use: clip details, highlights, and timestamps organized into a coherent editing timeline.
Overview: An Edit Decision List (EDL) describes how video clips are cut and compiled into a final sequence. It specifies which portions of a source video are included in the edit, their start and end times, and where they appear in the final timeline.
Clip Number: identifies the sequence of clips in the final edit.
Source In and Out Points: the start and end time of the clip in the source video.
Record In and Out Points: where that clip will be placed in the final edited video.
Clip Name: the name of the original video file being used.
Comments: additional notes on what the clip contains.

Format Example:
001 AX V C 00:00:00:00 00:03:31:06 01:00:00:00 01:03:31:06
* FROM CLIP NAME: alexdang.mp4
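Reading the example line left to right, the columns are: event number (001), reel name (AX, an auxiliary source), track (V for video), transition (C for cut), then source in, source out, record in, and record out. The sketch below splits such a line into its parts; a minimal sketch assuming plain whitespace-delimited CMX3600-style lines, with dictionary keys that are illustrative labels rather than part of the format:

def parse_edl_event(line: str) -> dict:
    # An event line is whitespace-delimited: number, reel, track,
    # transition, then the four timecodes.
    event, reel, track, transition, src_in, src_out, rec_in, rec_out = line.split()
    return {
        "event": event,            # 001 -> position in the final edit
        "reel": reel,              # AX  -> auxiliary/source reel
        "track": track,            # V   -> video track
        "transition": transition,  # C   -> straight cut
        "source_in": src_in,       # clip start in the source file
        "source_out": src_out,     # clip end in the source file
        "record_in": rec_in,       # clip start in the final timeline
        "record_out": rec_out,     # clip end in the final timeline
    }

parse_edl_event("001 AX V C 00:00:00:00 00:03:31:06 01:00:00:00 01:03:31:06")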
Steps to Generate an EDL:

1. Identify Source Timecodes: Use source timecodes (from subtitles or markers) to define the "in" and "out" points in the original video. For example, the useful segment of the video might begin at 00:00:00:00 and end at 00:03:31:06 in the source file.
2. Determine Record Timecodes: The record timecodes are where the selected clip will appear in the final timeline. The record timeline begins at the sequence's start timecode for the first clip (often 01:00:00:00 by convention, as in the example below) and continues sequentially.
3. Cut and Stitch Video Segments: You can cut out unwanted portions of the video and stitch the remaining parts together. For example, you might keep the beginning and the end but cut out the middle. To do this, identify the source timecodes of the segments you want to keep, omit the unwanted range, and continue the record timeline immediately after the cut.

Key EDL Components:
Source In/Out: the timecodes from the original clip (where a portion begins and ends).
Record In/Out: the timecodes indicating where that portion will appear in the final sequence.
Clip Name: the filename of the original video.
Comments: optional, but useful for describing the content of the clip.

Example Workflow for Creating an EDL:
Task: cut and stitch two parts of a 12-minute interview while removing the middle portion.
Source Video: alexdang.mp4 (12 minutes long).
Objective: use the first 3:31 minutes and the last 4:56 minutes, omitting the middle section. The resulting EDL:
TITLE: Timeline 2
FCM: NON-DROP FRAME
001 AX V C 00:00:00:00 00:03:31:06 01:00:00:00 01:03:31:06
* FROM CLIP NAME: alexdang.mp4
* COMMENT: Opening section of the video, full introduction.
002 AX V C 00:07:09:11 00:12:05:11 01:03:31:06 01:08:27:06
* FROM CLIP NAME: alexdang.mp4
* COMMENT: Final discussion, conclusion, and closing statements.
First Segment:
Source: 00:00:00:00 to 00:03:31:06 in alexdang.mp4.
Record: starts at 01:00:00:00 and ends at 01:03:31:06 in the final timeline.
This is the first 3 minutes and 31 seconds of the video, placed at the start of the final cut.

Second Segment:
Source: 00:07:09:11 to 00:12:05:11 in the original video.
Record: starts at 01:03:31:06 and ends at 01:08:27:06 in the final cut.
This is the last portion of the video (from the 7-minute mark to the end), stitched directly after the first segment with the middle part removed.
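Each record out point follows from simple frame arithmetic: record out = record in + (source out - source in). A minimal sketch verifying the second event, assuming 30 fps (the EDL declares NON-DROP FRAME, but the frame rate itself is an assumption):

def tc_to_frames(tc: str, fps: int = 30) -> int:
    # HH:MM:SS:FF timecode -> absolute frame count
    hh, mm, ss, ff = map(int, tc.split(":"))
    return ((hh * 60 + mm) * 60 + ss) * fps + ff

def frames_to_tc(frames: int, fps: int = 30) -> str:
    # absolute frame count -> HH:MM:SS:FF timecode
    ff = frames % fps
    ss = frames // fps
    return f"{ss // 3600:02d}:{ss % 3600 // 60:02d}:{ss % 60:02d}:{ff:02d}"

duration = tc_to_frames("00:12:05:11") - tc_to_frames("00:07:09:11")
print(frames_to_tc(tc_to_frames("01:03:31:06") + duration))  # 01:08:27:06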
Summary of Rules:
Identify important source segments: use timecodes from the source file (e.g., from subtitles or markers).
Define start and end points: for each segment, define the start (in) and end (out) points in both the source and the final edited timeline.
Stitch the timeline: after cutting, place each segment sequentially in the final timeline, ensuring they flow smoothly from one to the next.
Document the clip information: each segment in the EDL should include the clip name and optional comments for context.
Handle gaps: if a portion of the source video is not needed, simply omit that time range from the EDL.
These rules are pulled together in the sketch below.
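Here is a minimal sketch that writes a CMX3600-style EDL from a list of kept segments. It reuses the assumed tc_to_frames/frames_to_tc helpers from the sketch above; the AX/V/C columns and the 01:00:00:00 record start mirror the example rather than a formal spec:

def write_edl(title: str, clip_name: str, segments, fps: int = 30,
              record_start: str = "01:00:00:00") -> str:
    # segments: list of (source_in, source_out, comment), in playback order
    lines = [f"TITLE: {title}", "FCM: NON-DROP FRAME"]
    rec = tc_to_frames(record_start, fps)
    for i, (src_in, src_out, comment) in enumerate(segments, start=1):
        dur = tc_to_frames(src_out, fps) - tc_to_frames(src_in, fps)
        lines.append(f"{i:03d} AX V C {src_in} {src_out} "
                     f"{frames_to_tc(rec, fps)} {frames_to_tc(rec + dur, fps)}")
        lines.append(f"* FROM CLIP NAME: {clip_name}")
        lines.append(f"* COMMENT: {comment}")
        rec += dur  # the record timeline continues sequentially
    return "\n".join(lines)

print(write_edl("Timeline 2", "alexdang.mp4", [
    ("00:00:00:00", "00:03:31:06", "Opening section of the video, full introduction."),
    ("00:07:09:11", "00:12:05:11", "Final discussion, conclusion, and closing statements."),
]))

Running this reproduces the Timeline 2 example above.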
import streamlit as st
from openai import OpenAI

client = OpenAI(
    api_key="sk-proj-_**********"  # add your key here (or read it from an environment variable)
)

# Streamlit interface for file upload
st.title("MP3 to Whisper Transcription")
uploaded_file = st.file_uploader("Upload an audio file in .mp3 format", type=["mp3"])

# Response format selection options
response_format = st.radio(
    "Choose response format:",
    ["json", "text", "srt", "verbose_json", "vtt"],
    index=3  # Default to verbose_json
)

# Slider to select temperature
temperature = st.slider(
    "Select temperature (from 0 to 1):",
    min_value=0.0,
    max_value=1.0,
    value=0.5,  # Default value
    step=0.1
)

# Choice for timestamp granularity (word/segment)
timestamp_granularities = st.multiselect(
    "Choose timestamp granularity:",
    ["word", "segment"],
    default=["segment"] if response_format == "verbose_json" else []
)

# If a file is uploaded, start processing
if uploaded_file is not None:
    st.audio(uploaded_file, format="audio/mpeg")
    with st.spinner("Transcribing audio..."):
        # The API only accepts timestamp_granularities with verbose_json,
        # so pass it conditionally
        kwargs = {}
        if response_format == "verbose_json" and timestamp_granularities:
            kwargs["timestamp_granularities"] = timestamp_granularities
        transcript = client.audio.transcriptions.create(
            model="whisper-1",
            file=uploaded_file,
            response_format=response_format,  # Using selected response format
            temperature=temperature,          # Using selected temperature
            **kwargs
        )
    if response_format in ["json", "verbose_json"]:
        st.json(transcript.model_dump())  # JSON formats return a pydantic object
    else:
        st.code(transcript, language="text")  # text/srt/vtt return a plain string
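To connect the transcription back to the EDL workflow above: with verbose_json and segment granularity, each segment in the response carries start and end times in seconds plus its text, which map directly onto source in/out points and an EDL comment. A minimal sketch (seconds_to_tc is a hypothetical helper, again assuming 30 fps):

def seconds_to_tc(seconds: float, fps: int = 30) -> str:
    # Whisper's float seconds -> HH:MM:SS:FF timecode
    total = round(seconds * fps)
    ff = total % fps
    ss = total // fps
    return f"{ss // 3600:02d}:{ss % 3600 // 60:02d}:{ss % 60:02d}:{ff:02d}"

for seg in transcript.model_dump().get("segments", []):
    print(seconds_to_tc(seg["start"]), seconds_to_tc(seg["end"]), seg["text"].strip())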