User:Commander Keane/Audio workflow

This page is about my workflow for adding audio files to Wiktionary. In a nutshell it has two purposes - to answer the question "How do you do it so fast?" and to remind myself what the steps are and what the code is.

Capture

I use an Audio-Technica ATR2100-USB microphone. It is described as cardioid dynamic. It costs about US$64 and includes a desk stand. I use it in USB mode.

I also have a gooseneck pop filter that works with the desk stand.

I run MS Windows as my operating system. I use the proprietary digital audio workstation software Reaper to "treat" the audio as I record on the fly. I use VB-Cable Audio Virtual Cable to channel the audio through Reaper and into Python.

In Reaper I have a few FX filters set up, mainly following Elijah Lucian's Reaper for Voice Actors videos. The main difference is that I don't use a noise gate but prefer noise subtraction utilising ReaFir. Although I have Reaper doing fancy stuff like compression, noise reduction and EQ, its main purpose is to boost the volume of the recordings.

Unfortunately I don't have a sound booth and I record in my bedroom in a suburban household. When I record I do turn off the air conditioning and ceiling fan. As it is summer here, the final piece of "kit" is a cloth to mop up the sweat that drips onto the keyboard.

Step 1 - generating lists

The 4 steps. I simply right click and click "Run with PowerShell" to get going
Generated lists

This first step is getting a list of entries to record. I simply right-click the "Step 1 - generate file lists" icon and choose "Run with PowerShell".

Before I do that I can change a couple of settings in xFileGenBot10.py:

recurseGlobal = 0 (where 0 is the category depth), and
lines_per_file = 200 (where 200 is the number of entries to work on in a batch. I find 200 is a nice balance.)

Once I type in the category it saves a series of text files.
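
A minimal sketch of how a list of entry titles could be split into batch files of lines_per_file entries each. This is not the code in xFileGenBot10.py: the function name write_batches and the "<category>_<n>.txt" naming pattern are assumptions made for illustration, and fetching the titles from the category is left out.

```python
from pathlib import Path


def write_batches(titles, category, out_dir, lines_per_file=200):
    """Write titles to numbered text files, lines_per_file titles per file.

    Hypothetical helper: the "<category>_<n>.txt" naming pattern is an
    assumption, not the pattern the real script uses.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for start in range(0, len(titles), lines_per_file):
        batch = titles[start:start + lines_per_file]
        path = out_dir / f"{category}_{start // lines_per_file + 1}.txt"
        path.write_text("\n".join(batch) + "\n", encoding="utf-8")
        paths.append(path)
    return paths
```

With 450 titles and lines_per_file = 200 this writes three files holding 200, 200 and 50 entries.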

Step 2 - recording

When running, the program gives me various options whilst displaying the current entry to have its pronunciation recorded.
I can pause the program (to clear my throat, or re-take a pronunciation). It doesn't really pause the recording, but stops recording and queues up the last entry for another recording attempt.
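
The "pause" behaviour can be sketched as a queue where a rejected take puts the same entry back at the front. This is only an illustration of the control flow, reading "last entry" as the one just attempted. The callbacks record_take and wants_retake are hypothetical stand-ins for the real audio capture and key handling.

```python
from collections import deque


def run_session(entries, record_take, wants_retake):
    """Record every entry, repeating an entry whenever a retake is requested.

    record_take(entry) and wants_retake(entry) are hypothetical callbacks:
    the first captures one take, the second returns True if the take
    should be thrown away and recorded again.
    """
    queue = deque(entries)
    accepted = []
    while queue:
        entry = queue.popleft()
        record_take(entry)
        if wants_retake(entry):
            # Put the same entry back at the front so it comes up next.
            queue.appendleft(entry)
        else:
            accepted.append(entry)
    return accepted
```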

For the next 3 steps to work easily I simply copy the filename of the list I am working on into the first line of "Generated lists.txt". E.g. I would place "English idioms_sf_200" at the top and now run step 2.
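
A small sketch of how a script could pick up the current list name from the top of "Generated lists.txt". The function name current_list_name is an assumption, and skipping blank lines before the name is a leniency added here, not something the real scripts are known to do.

```python
def current_list_name(path="Generated lists.txt"):
    """Return the first non-empty line of the file, without surrounding whitespace."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                return line.strip()
    raise ValueError(f"{path} does not contain a list name")
```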

Before I invest my time in a big batch of recording, I always delete the "Mic Test.wav" file and then double click the "Test the microphone.bat" to check if my mic is turned on and recording.
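
One way to check the test recording without listening to it is to look at its peak level. The sketch below is not what "Test the microphone.bat" does. It only reads an existing 16-bit PCM wav file with the standard library, and the threshold of 500 is an arbitrary assumed value that would need tuning.

```python
import array
import sys
import wave


def wav_peak(path):
    """Return the largest absolute sample value in a 16-bit PCM wav file."""
    with wave.open(str(path), "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wf.readframes(wf.getnframes())
    samples = array.array("h", frames)
    if sys.byteorder == "big":
        samples.byteswap()  # wav data is little-endian
    return max((abs(s) for s in samples), default=0)


def mic_seems_live(path, threshold=500):
    """True if the file is louder than the (assumed) silence threshold."""
    return wav_peak(path) > threshold
```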

It takes about 10 minutes to record a batch of 200 entries.

Step 3 - wav to ogg

Audacity showing the waveform with the files listed on the left.

Step 3 takes the wav files saved in Step 2 and converts them to ogg. It also adds the "en-au-" prefix. Importantly, this step creates a new audio file which includes all recordings separated by a tone. In Audacity I then listen to the file (and watch the waveforms) for quality assurance - cut-off audio, clipping, and interruptions.
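
A sketch of the commands such a step could build. Which tool does which job is an assumption: here ffmpeg encodes each wav to Ogg Vorbis and sox joins the results, relying on sox concatenating multiple input files (they need matching sample rate and channel count). The function names and the "review.ogg" output name are hypothetical. The functions only build argument lists, which could then be passed to subprocess.run.

```python
from pathlib import Path

PREFIX = "en-au-"  # prefix mentioned in the text above


def ogg_name(wav_path):
    """Map "word.wav" to "en-au-word.ogg"."""
    return PREFIX + Path(wav_path).stem + ".ogg"


def convert_command(wav_path, out_dir):
    """ffmpeg argument list that encodes one wav file to Ogg Vorbis."""
    out = Path(out_dir) / ogg_name(wav_path)
    return ["ffmpeg", "-i", str(wav_path), "-c:a", "libvorbis", str(out)]


def review_command(ogg_paths, tone="tone.ogg", out="review.ogg"):
    """sox argument list that joins all recordings with the tone between them."""
    inputs = []
    for i, path in enumerate(ogg_paths):
        if i:
            inputs.append(str(tone))  # tone goes between recordings only
        inputs.append(str(path))
    return ["sox", *inputs, str(out)]
```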

Step 4 - upload

Uploading to Commons is fully automated.

Step 5 - adding links

In this case a "Pronunciation" section already exists so the program suggests putting the audio template after the IPA stuff
In this case there was no "Pronunciation" section so the program creates one

The final step is to add links to the audio files in Wiktionary entries. My tool suggests a place for the audio template to go and then displays the modified wikitext in a pop-up. I then click "ok" or move the template to the correct place according to Wiktionary:Entry_layout.
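
The placement suggestion can be sketched roughly as follows. This is a simplified illustration, not the logic in xPutInWiki7: it assumes a single-language page, assumes the audio line has the form "* {{audio|en|<file>|<label>}}", and when no Pronunciation section exists it creates one before the first level-3 heading that is not an Etymology heading. The result still needs the manual check described above.

```python
def add_audio(wikitext, filename, label="Audio (AU)"):
    """Return wikitext with an audio line added to the Pronunciation section.

    Assumed template form; check current Wiktionary conventions before use.
    """
    audio = "* {{audio|en|%s|%s}}" % (filename, label)
    heading = "===Pronunciation==="
    lines = wikitext.split("\n")
    starts = [i for i, line in enumerate(lines) if line.strip() == heading]

    if starts:
        start = starts[0]
        insert_at = start + 1
        # Walk the section and remember the spot after the last IPA line.
        for i in range(start + 1, len(lines)):
            if lines[i].startswith("=="):
                break
            if "{{IPA" in lines[i]:
                insert_at = i + 1
        lines.insert(insert_at, audio)
    else:
        # No section yet: create one before the first level-3 heading
        # that is not an Etymology heading, or at the end of the page.
        target = len(lines)
        for i, line in enumerate(lines):
            s = line.strip()
            if (s.startswith("===") and not s.startswith("====")
                    and not s.startswith("===Etymology")):
                target = i
                break
        lines[target:target] = [heading, audio, ""]
    return "\n".join(lines)
```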

To get the list of ogg files to link, I use xPutInWiki7.py and specify (at about line 105) the start and end time of the creation of the first and last upload as seen in my Commons contributions page.
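
The time-window selection amounts to a filter over (title, upload time) pairs. The sketch below assumes those pairs have already been fetched from the contributions list. The function name uploads_between and the sample titles and times are made up for illustration.

```python
from datetime import datetime, timezone


def uploads_between(uploads, start, end):
    """Return titles uploaded within [start, end], oldest first.

    uploads is an iterable of (title, datetime) pairs; how they are
    obtained from Commons is outside this sketch.
    """
    chosen = [(when, title) for title, when in uploads if start <= when <= end]
    return [title for when, title in sorted(chosen)]


# Made-up example data.
utc = timezone.utc
uploads = [
    ("File:en-au-b.ogg", datetime(2020, 1, 5, 10, 2, tzinfo=utc)),
    ("File:en-au-a.ogg", datetime(2020, 1, 5, 10, 1, tzinfo=utc)),
    ("File:en-au-old.ogg", datetime(2020, 1, 4, 9, 0, tzinfo=utc)),
]
batch = uploads_between(
    uploads,
    datetime(2020, 1, 5, 10, 0, tzinfo=utc),
    datetime(2020, 1, 5, 10, 30, tzinfo=utc),
)
```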

I can add about 5-7 links to Wiktionary per minute, so a batch of 200 can be wrapped up in roughly 30-40 minutes. This makes it the most time-consuming part of the entire process.

Commands to run:

cd c:\users\jim\pywikibot
python3 pwb.py xPutInWiki7

Tech stuff

Need to install:

pip install pyaudio
Install sox - pip wasn't working for me so I installed sox as a normal program and added it to my path
ffmpeg (Windows tutorial)
Don't forget to add the tone to C:\Users\jim\pywikibot (tone.ogg)
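
A quick way to confirm the setup before a session is to check that the command-line tools are on the path, the Python module can be found, and the tone file exists. This is a sketch using only the standard library. The function name missing_requirements is an assumption, and the executable names "sox" and "ffmpeg" are assumed to match what the installers put on the path.

```python
import importlib.util
import shutil
from pathlib import Path


def missing_requirements(tools=("sox", "ffmpeg"), modules=("pyaudio",), tone=None):
    """Return the names of tools, modules and the tone file that cannot be found."""
    missing = [t for t in tools if shutil.which(t) is None]
    missing += [m for m in modules if importlib.util.find_spec(m) is None]
    if tone is not None and not Path(tone).is_file():
        missing.append(str(tone))
    return missing
```

An empty list means everything was found. For example, missing_requirements(tone=r"C:\Users\jim\pywikibot\tone.ogg") would also check the tone file mentioned above.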

TODO

Unicode compatibility with Step 5 (adding links - xPutInWiki7)
Command shortcut for step 5
Minute by minute edit breakdown for Commons uploads and en.wikt edits.
- Automate the start and end times in Step 5
- Check Commons for unused audio files
Clean up xPutInWiki7 (console output / remaining counter)
Step 2 - recording - include remaining counter
Check uploaded files for ogg validation errors