Voice Editor Capstone

April 26

Enhancements over the weekend.

I put a ton of effort into adding the small things that make this feel like a much more complete product. I finally added the ability for users to fine-tune a selected portion of audio without having to reselect it from scratch. I cleaned up most of the artefacts left behind when selection markers did not erase properly, and I added animations that track which part of the audio clip is playing. A few final things on the to-do list include expanding the default audio library recorded from my voice (and possibly adding other voices), fleshing out user options for tweaking the final phrase, and limiting the length of some user-defined filenames (who knew that 260 chars for a filename was excessive?!).
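For the filename item, a minimal sketch of one way to cap a user-supplied name while keeping its extension might look like the following. The class name, method name, and the 64-character limit are all assumptions made for illustration and are not taken from the project. It is also worth noting that the traditional Windows limit of 260 characters (MAX_PATH) applies to the full path, not just the file name, so a cap well below that leaves room for the folder portion.

```csharp
using System;
using System.IO;

// Hypothetical helper, not the project's actual code.
public static class ClipFileNames
{
    // Assumed limit; chosen only for illustration.
    public const int MaxFileNameLength = 64;

    // Shortens a bare file name (no directory part) to at most maxLength
    // characters, trimming the stem and preserving the extension.
    public static string Limit(string fileName, int maxLength = MaxFileNameLength)
    {
        if (string.IsNullOrEmpty(fileName))
            throw new ArgumentException("A file name is required.", nameof(fileName));

        if (fileName.Length <= maxLength)
            return fileName;

        string extension = Path.GetExtension(fileName);            // e.g. ".wav"
        string stem = Path.GetFileNameWithoutExtension(fileName);

        int room = maxLength - extension.Length;
        if (room < 1)
            throw new ArgumentException("maxLength is too small for the extension.", nameof(maxLength));

        return stem.Substring(0, Math.Min(stem.Length, room)) + extension;
    }
}
```

Names that already fit are returned unchanged, so the helper could be applied to every name a user types without affecting short ones.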

April 17

Barking up some wrong trees.

I have spent a significant amount of time looking into ways to increase the quality of combined voice clips - the general understanding from my research is that human voices are incredibly intricate, and their inflections and nuances are nearly impossible to mimic accurately. Another issue I've been running into is how hard it is to save a useful bit of audio from a conversation. When we talk, our words run together - we frequently flow right into the next word or phrase, which makes saving a single word nearly impossible in some situations without it being 'contaminated' by the surrounding speech. As an example, I tried recording the phrase "A new recording", but it was impossible to grab a good sample of the word "a" since my pronunciation came out more like "anew recording".

In other news, I created what I think is a really nifty feature - users can select the voice they'd like to use, type in a phrase, and the program will do its best to construct that phrase from that voice's audio files, if they are available. Sometimes the phrasing is still disjointed, given the inflections, odd pauses (or lack thereof) between words, etc., but the user has the ability to clean up the resulting phrase as they wish.
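The phrase-building idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual implementation: it assumes each word of the chosen voice is already available as raw PCM bytes in a dictionary keyed by the lowercase word, that all clips share one format, and that a run of zero bytes is acceptable as silence between words (true for 16-bit signed PCM, not for 8-bit). The names PhraseBuilder, Tokenize, and Build are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical sketch, not the project's actual code.
public static class PhraseBuilder
{
    private static readonly char[] Punctuation = ".,!?;:\"'".ToCharArray();

    // Splits a typed phrase into lowercase words, dropping surrounding punctuation.
    public static List<string> Tokenize(string phrase)
    {
        return (phrase ?? string.Empty)
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries) // null = split on whitespace
            .Select(w => w.Trim(Punctuation).ToLowerInvariant())
            .Where(w => w.Length > 0)
            .ToList();
    }

    // Concatenates the clip for each word, inserting gapBytes of silence
    // between clips. Words with no clip are skipped and reported in 'missing'.
    public static byte[] Build(string phrase, IDictionary<string, byte[]> clips,
                               int gapBytes, out List<string> missing)
    {
        missing = new List<string>();
        using (var output = new MemoryStream())
        {
            foreach (string word in Tokenize(phrase))
            {
                if (clips.TryGetValue(word, out byte[] clip))
                {
                    if (output.Length > 0 && gapBytes > 0)
                        output.Write(new byte[gapBytes], 0, gapBytes); // zero bytes as silence (assumes 16-bit PCM)
                    output.Write(clip, 0, clip.Length);
                }
                else
                {
                    missing.Add(word);
                }
            }
            return output.ToArray();
        }
    }
}
```

Reporting the missing words separately, rather than failing outright, matches the "do its best" behavior described above: the user still gets a partial phrase and can see which words need to be recorded.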

April 12

Overhauled Design, Working Engine

I have a working engine! The NAudio library is an excellent resource for implementing basic audio functionality. I heavily modified several of its functions to act as my custom waveform viewer and tied in the ability to determine the current location within the wave, which is essential for cutting voice snippets and playing back portions of audio. I also overhauled the design - instead of an arbitrary number of waveform objects, I decided on three, stacked one above the other across the form. Each one has a dedicated purpose: the top is for recording new audio clips and working with / cutting raw data, the middle is for fine-tuning a trimmed audio clip and re-trimming for the desired effect, and the last is for piecing together entire phrases.
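As a rough illustration of what "determining the location within the wave" involves, the sketch below maps a horizontal pixel position in a waveform display to a byte offset in the PCM data and then cuts out the bytes between two offsets. It assumes the display shows the whole clip across its width and that the block size (bytes per sample frame, which NAudio exposes as WaveFormat.BlockAlign) is known. The class and method names are hypothetical and do not come from NAudio or from the project.

```csharp
using System;

// Hypothetical sketch, not part of NAudio or the project's actual code.
public static class WaveformSelection
{
    // Maps an x coordinate inside a waveform control to a byte offset in the
    // PCM data, rounded down to a whole sample frame so a cut never splits a sample.
    public static long PixelToByteOffset(int x, int controlWidth, long dataLength, int blockAlign)
    {
        if (controlWidth <= 0) throw new ArgumentOutOfRangeException(nameof(controlWidth));
        if (blockAlign <= 0) throw new ArgumentOutOfRangeException(nameof(blockAlign));

        long clampedX = Math.Max(0, Math.Min(x, controlWidth));
        long totalFrames = dataLength / blockAlign;
        long frame = clampedX * totalFrames / controlWidth; // integer math avoids rounding drift
        return frame * blockAlign;
    }

    // Returns a copy of the bytes between two offsets. The offsets may be given
    // in either order (e.g. when the user drags right-to-left) and are clamped
    // to the bounds of the data.
    public static byte[] Cut(byte[] pcm, long startOffset, long endOffset)
    {
        if (pcm == null) throw new ArgumentNullException(nameof(pcm));

        long start = Math.Min(pcm.Length, Math.Max(0, Math.Min(startOffset, endOffset)));
        long end = Math.Min(pcm.Length, Math.Max(startOffset, endOffset));
        if (end < start) end = start;

        var snippet = new byte[end - start];
        Array.Copy(pcm, start, snippet, 0, snippet.Length);
        return snippet;
    }
}
```

For example, with a 100-pixel-wide display, 4000 bytes of data, and 4-byte frames, a click at x = 50 maps to byte offset 2000. The same offsets could serve both for trimming a clip and for choosing where playback of a selection starts and stops.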

February 23

Quiet week for an audio project

Unfortunately, I did not get very far this past week. I met with Dr. Pankratz early in the week to discuss my direction with the application and what he saw as the key goals to work toward at this point. I do have some idea of how the UI will be laid out. Dr. Pankratz also brought up the question of how I might store the audio files and, in particular, how to differentiate between the audio files of several different voices, which I hadn't yet given much thought to.

February 16

Waveform and audio playback

This weekend I spent some time playing around with wave files in C#. I've found NAudio - an open-source .NET audio library - which has several components useful for manipulating audio in C#. I have altered one of its components for displaying waveforms to make it more robust and to allow selecting portions of the loaded file.

Audio Waveform Display

At this point my goal is to put together a more useful system for loading audio files and being able to work with several files at once. I am currently working on the ability to play selected portions of a loaded waveform as well as working the waveform display controls into a useful user interface solution.

February 08

Website up

I've put far more work into creating this website than originally intended. I was simply going to grab a template, fill out some content, and throw it up on the server (easy, right?) but then I started tweaking this and that to make it all come together. In the end I think it looks nice and I like how the information is presented, so it was worth it.

February 02

Project assignment and some ideas

After receiving my senior capstone project assignment, I did some cursory research on topics related to developing a voice editing application and collected resources on plotting waveforms, speech recognition, and audio handling in C# - the language I've decided to use for this project. My initial goal is to develop a simple prototype to explore the functionality of various options for audio manipulation in C#.

Audio Clip Control Sketch

I am thinking one of the first steps I will take is to attempt to design a custom control for displaying and manipulating an audio clip. Ideally something similar to my sketch above - a displayed waveform and various controls for editing and modifying the given audio sample.
