Everyone says start small. My first real project was not small at all. But it was mine, and it taught me more than anything I had tried before. So this is where my story starts.
One quick thing about me first, since this is the beginning of the whole blog. I am self-taught. I never studied programming properly. What I do is build things by working with AI: telling it exactly what I want, testing it, breaking it, and fixing it until it works. Some of what I make is good. Some of it fails. I am going to show you all of it, honestly, including the actual technical decisions, the tools, and the code choices, because that is the part people can actually learn from. This first project is a video tool, and here is why I made it and how I built the first version.
The thing that annoyed me
I kept watching those movie recap videos. You have seen them. Someone takes a full movie, cuts it into short clips, adds a voiceover, and turns it into something you watch in two minutes on your phone. They do well. I wanted to make them too.
Then I tried to make one by hand, and that was the end of the fun. To make a single recap you have to watch the whole movie, find the good moments, cut each clip one by one, line them up, and export. Hours of boring work for one video. I am one person. There was no way I could do this again and again.
So I thought the way I always think when something wastes my time. The computer should do this for me. If a program could open a movie, find where the scenes change on its own, and pull out short clips automatically, then the boring part would be gone and I could just do the creative part.
What I decided to build it with
The first real decision was the tools, and I want to be specific, because these choices shaped everything.
I chose Python as the language. Python is friendly, forgiving, and fast to build with, which is exactly what a self-taught person needs. You can get something working without fighting the language itself.
For the interface, the actual window with buttons and dropdowns and sliders, I used Tkinter. Tkinter ships with the standard Python installers, so there is usually nothing extra to install, and it makes proper desktop windows. It is not the prettiest tool in the world, but it runs on Windows, macOS, and Linux, and for a first desktop app that mattered more than beauty.
For the heavy lifting on the video itself, the actual cutting and converting, I used FFmpeg. FFmpeg is the engine behind a huge amount of the video software you already use. It is free, it is powerful, and it does the real work of slicing a clip out of a movie or changing its size and format. My program was, in a sense, a friendly face on top of FFmpeg, sending it the right instructions based on what I clicked.
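To make "sending it the right instructions" concrete, here is a minimal sketch of how a Python program can hand one clip to FFmpeg. It is an illustration, not the exact code from the tool: the function names and file names are made up, and it assumes the ffmpeg program is installed and on the PATH. The flags themselves (-ss, -i, -t, -c copy, -y) are standard FFmpeg options.

```python
import subprocess


def build_cut_command(source, start_seconds, duration_seconds, destination):
    """Return the FFmpeg argument list that copies one clip out of a video."""
    return [
        "ffmpeg",
        "-y",                          # overwrite the output file if it exists
        "-ss", str(start_seconds),     # seek to where the clip starts
        "-i", source,                  # the input movie
        "-t", str(duration_seconds),   # keep this many seconds
        "-c", "copy",                  # copy the streams without re-encoding
        destination,
    ]


def cut_clip(source, start_seconds, duration_seconds, destination):
    """Run FFmpeg; raises CalledProcessError if FFmpeg reports a failure."""
    command = build_cut_command(source, start_seconds, duration_seconds, destination)
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical file names, printed only to show the shape of the command.
    print(build_cut_command("movie.mp4", 90, 20, "clip_001.mp4"))
```

Passing the command as a list, rather than one long string, avoids trouble with spaces in file names. Copying streams is fast, but the cut may not land exactly on the requested second; re-encoding is slower and more precise.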
I built the program in an object-oriented way, meaning the whole app lived inside a class that held all its parts together: the settings, the buttons, the processing. That kept things organised instead of becoming a loose pile of code.
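As a rough picture of that shape, here is a small sketch of one class owning the settings and the logic that depends on them. Every name in it (OutputSettings, ClipperApp, the field names, the default values) is hypothetical, and the Tkinter widgets are left out so the example runs without a window. In the real app the same class would also create the buttons and sliders.

```python
from dataclasses import dataclass


@dataclass
class OutputSettings:
    """Hypothetical settings the window's controls would fill in."""
    resolution: str = "1280x720"
    bitrate: str = "2M"
    container: str = "mp4"
    keep_original: bool = False


class ClipperApp:
    """Holds the settings and the processing logic in one place."""

    def __init__(self, settings=None):
        self.settings = settings or OutputSettings()
        self.is_running = False  # a start button would flip this

    def output_args(self):
        """Translate the current settings into FFmpeg output options."""
        if self.settings.keep_original:
            return ["-c", "copy"]  # keep the source encoding untouched
        width, height = self.settings.resolution.split("x")
        return ["-vf", f"scale={width}:{height}", "-b:v", self.settings.bitrate]

    def output_name(self, index):
        """Build a numbered file name such as clip_007.mp4."""
        return f"clip_{index:03d}.{self.settings.container}"


if __name__ == "__main__":
    app = ClipperApp()
    print(app.output_args(), app.output_name(1))
```

Because the settings live on the object, any method can read them without passing a long list of arguments around.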
What I wanted it to do
Before writing much, I wrote down every feature I wanted, and it was a long list.
I wanted to pick a video file and choose where to save the clips. I wanted control over the output: resolution, aspect ratio, output format, and bitrate, with an option to just keep the original settings when I was in a hurry. I wanted scene detection, so the tool could find the cuts in a movie on its own, and a slider to set the longest a scene could be, somewhere in the range of five to thirty-five seconds. I wanted to choose how long the extracted clips would be.
At that early stage I went even further. I wanted the tool to transcribe the audio and give me a quick summary, so I planned to use Whisper, which is a speech-to-text system that listens to audio and writes out the words. And I wanted start, pause, and resume buttons, plus a progress bar, so a long job would not just freeze the window or crash without explanation.
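The "longest a scene could be" slider is easier to picture with a small example. The sketch below assumes the scene detector has already produced a list of cut times in seconds, and only shows the capping step: any scene longer than the slider value is split into shorter pieces. The function names are made up for illustration, and this is one possible way to apply the limit, not necessarily how the original tool did it.

```python
MIN_SCENE_SECONDS = 5.0    # lower end of the slider
MAX_SCENE_SECONDS = 35.0   # upper end of the slider


def clamp_max_scene(value):
    """Keep the slider value inside the five to thirty-five second range."""
    return max(MIN_SCENE_SECONDS, min(MAX_SCENE_SECONDS, float(value)))


def scenes_from_cuts(cut_times, total_duration, max_scene):
    """Turn cut times into (start, end) scenes no longer than max_scene."""
    max_scene = clamp_max_scene(max_scene)
    total = float(total_duration)
    inner = sorted(float(t) for t in cut_times if 0.0 < t < total)
    boundaries = [0.0] + inner + [total]

    scenes = []
    for start, end in zip(boundaries, boundaries[1:]):
        # Slice off full-length pieces until the remainder fits the limit.
        while end - start > max_scene:
            scenes.append((start, start + max_scene))
            start += max_scene
        if end > start:
            scenes.append((start, end))
    return scenes


if __name__ == "__main__":
    # Cuts at 10s and 60s in a 70s video, with the slider set to 30s.
    print(scenes_from_cuts([10, 60], 70, 30))
```

In that example the middle scene runs from 10s to 60s, which is longer than the limit, so it comes out as two pieces: 10s to 40s and 40s to 60s.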
Looking back, that was far too much for a first version. But at the time I wanted everything.
How the first version actually worked
A few technical pieces made the first version function, and they are worth explaining simply.
Because processing a whole movie takes time, I could not let it run on the main thread, the one that draws the window and reacts to clicks, or the window would freeze completely while it worked, looking like it had crashed. So I ran the heavy processing on a separate thread, which is just a way of doing a long job in the background while the window stays awake and responsive. That is also what let the progress bar update as it went.
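Here is a minimal sketch of that pattern, kept free of any window code so it runs anywhere. A background thread does the work and drops progress values into a queue, and the main side reads them out. The names are hypothetical, the sleep stands in for real video work, and the pause and stop events are one common way to support pause and resume buttons, not necessarily how the original tool did it.

```python
import queue
import threading
import time


def process_clips(clip_count, progress, resume, stop):
    """Background job: report progress after each clip, honour pause and stop."""
    for done in range(1, clip_count + 1):
        resume.wait()              # blocks here while the job is paused
        if stop.is_set():
            break
        time.sleep(0.01)           # stand-in for the real video work
        progress.put(done / clip_count)
    progress.put(None)             # sentinel: the worker has finished


def drain(progress):
    """Collect whatever progress values are waiting, without blocking."""
    values = []
    while True:
        try:
            values.append(progress.get_nowait())
        except queue.Empty:
            return values


if __name__ == "__main__":
    progress = queue.Queue()
    resume, stop = threading.Event(), threading.Event()
    resume.set()                   # start in the running state
    worker = threading.Thread(
        target=process_clips, args=(4, progress, resume, stop), daemon=True
    )
    worker.start()
    worker.join()                  # a GUI would keep polling instead of joining
    print(drain(progress))
```

In a Tkinter app the main thread would typically call something like drain on a timer, using the window's after method, and move the progress bar from there, since widgets are generally safest to touch from the main thread only. Clearing the resume event pauses the worker at its next step, and setting it again lets the worker continue.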
I added a check at startup that looked for the tools the program depended on, like FFmpeg, before it tried to run, so that if something was missing it could say so clearly instead of failing in a confusing way later.
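A startup check like that can be very small. This sketch uses shutil.which from the standard library, which looks for a program on the PATH and returns None when it is not found. The function names, the message wording, and the tool list are assumptions for illustration; only FFmpeg is named in this post.

```python
import shutil

# Assumed list: FFmpeg is the only dependency named here.
REQUIRED_TOOLS = ("ffmpeg",)


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of required programs that are not on the PATH."""
    return [name for name in tools if shutil.which(name) is None]


def startup_message(missing):
    """Build a clear, human-readable message about the check result."""
    if not missing:
        return "All required tools were found."
    return (
        "Missing required tools: " + ", ".join(missing)
        + ". Install them and make sure they are on your PATH."
    )


if __name__ == "__main__":
    print(startup_message(find_missing_tools()))
```

Running this before any processing starts means a missing tool shows up as one plain sentence instead of a confusing error halfway through a job.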
And I gave it a dark theme, a dark blue background, because I wanted it to look like real software, not a plain grey school project.
The first time I opened it and watched it actually pull clips out of a real movie, that feeling is hard to explain. I built that. No degree, just stubbornness, a clear idea, and a willingness to keep testing until it worked.
Where it went wrong
Here is the honest part. It worked, but it was a mess. I had stuffed in every feature I could think of, including ones I did not really need, like the Whisper transcription, which dragged in heavy extra software and slowed everything down. The screen was crowded with options. One slider would not even show its value while I dragged it, so I could not tell what I was setting. It looked busy and felt heavy.
I did not realise it yet, but I had made the most common beginner mistake. I thought more features meant a better tool. They do not. A tool that tries to do everything usually does nothing well, and it confuses the person using it, even when that person is you.
Next
So I had a working tool that was also a cluttered, heavy mess. The next step was not adding more. It was the opposite. I had to look at my own creation, admit it was ugly, and start cutting things away, starting with that heavy transcription feature. That is the next post, where I rebuilt the whole look and learned that deleting features, and the dependencies behind them, can make software better.
What is FFmpeg, and why use it instead of a ready-made library?
FFmpeg is a free, powerful program that handles almost any video or audio task: cutting, converting, resizing, and more. A lot of video software is built on top of it. Using it directly meant my program just had to send it the right instructions, which gave me far more control than a simpler ready-made library would have.
Why Tkinter for the interface?
Because it ships with the standard Python installers, so there is usually nothing extra to install, and it makes real desktop windows on Windows, macOS, and Linux. It is not the most beautiful tool, but for a first desktop app, simple and dependable beats pretty.
What does running the work on a separate thread actually mean?
It means doing the slow job, processing the video, in the background, so the window stays awake and responsive instead of freezing. Without that, a long task makes the whole program look frozen and crashed, even though it is just busy.