In this tutorial we will load audio from a file (in almost any audio file format) and save it back to a file (again in almost any audio file format).
Before we start, gtkIOStream must be compiled and installed. Some components of gtkIOStream work without compilation and installation, but parts of the audio processing need it. Tutorial 0 guides you through installing gtkIOStream and testing the installation.
Our first step is to open Geany, the standard lightweight Raspberry Pi IDE, with a blank file for us to code in, like so :
Code: Select all
geany SoxInOut.C
If you are working on the Pi remotely, log in with X forwarding first so that the editor window can open on your own screen :
Code: Select all
ssh -X pi@raspberrypiIP
Here is the complete program, which we then walk through step by step :
Code: Select all
#include <Sox.H>
#include <iostream>
int main(int argc, char *argv[]) {
    if (argc<3){ // input check
        cout<<"Usage:\n"<<argv[0]<<" audioFileNameIn audioFileNameOut"<<endl;
        return -1;
    }
    Sox<int> sox; // declare our sox object reading/writing int (32 bits)
    int ret; // return variable on error
    if ((ret=sox.openRead(argv[1]))<0 && ret!=SOX_READ_MAXSCALE_ERROR)
        return SoxDebug().evaluateError(ret, argv[1]);
    Eigen::Array<float, Eigen::Dynamic, Eigen::Dynamic> x; // the buffer
    int N=100*1024; // number of frames to load in each time
    if ((ret=sox.read(x, N))!=0) // read in the first block (once only, so no audio is skipped)
        return SoxDebug().evaluateError(ret);
    // Now that we know how many channels the file is, open the output file
    if ((ret=sox.openWrite(argv[2], sox.getFSIn(), x.cols(), pow(2.,31)-1)))
        return SoxDebug().evaluateError(ret);
    // main loop, write and read
    int i=0; // remember how many times we have looped
    while (x.rows()!=0){ // wait till no more read in
        // process your audio here
        if ((ret=sox.write(x))!=x.rows()*x.cols())
            return SoxDebug().evaluateError(ret);
        cout<<'\r'<<((float)(++i*N)/sox.getFSIn())<< " seconds written";
        if ((ret=sox.read(x, N))!=0) // read in again
            return SoxDebug().evaluateError(ret);
    }
    sox.closeWrite();
    sox.closeRead();
    return 0;
}
The first step is to include our Sox and iostream headers. We then enter our "main" function. Like so :
Code: Select all
#include <Sox.H>
#include <iostream>
int main(int argc, char *argv[]) {
Next we check that both the input and the output file names were given on the command line, and print a usage message if not :
Code: Select all
if (argc<3){ // input check
    cout<<"Usage:\n"<<argv[0]<<" audioFileNameIn audioFileNameOut"<<endl;
    return -1;
}
We instantiate our Sox object. At this point we have to tell Sox which sample type to use for reading and writing audio; here we use int (32 bits) :
Code: Select all
Sox<int> sox;
Next we want to open the input file, the read file :
Code: Select all
int ret; // return variable on error
if ((ret=sox.openRead(argv[1]))<0 && ret!=SOX_READ_MAXSCALE_ERROR)
    return SoxDebug().evaluateError(ret, argv[1]);
We now make a buffer to hold the audio data (perhaps you want to do computations on that buffer before writing back out to file) :
Code: Select all
Eigen::Array<float, Eigen::Dynamic, Eigen::Dynamic> x; // the buffer
We have to specify how many frames of audio we want to process each time. An audio frame holds one sample for each channel, so for a two channel file one frame is two samples. We set the frame count and read in the first block like so :
Code: Select all
int N=100*1024; // number of frames to load in each time
if ((ret=sox.read(x, N))!=0) // read in the first block (once only, so no audio is skipped)
    return SoxDebug().evaluateError(ret);
Now we know the sample rate of the input audio file (the sox.getFSIn method), the number of channels (x.cols) and we can open the output file for writing :
Code: Select all
if ((ret=sox.openWrite(argv[2], sox.getFSIn(), x.cols(), pow(2.,31)-1)))
    return SoxDebug().evaluateError(ret);
We now enter the read write loop like so :
Code: Select all
int i=0; // remember how many times we have looped
while (x.rows()!=0){ // wait till no more read in
At this point if you want to process the audio data, you should ... we choose not to !
We write the audio to the output file :
Code: Select all
if ((ret=sox.write(x))!=x.rows()*x.cols())
    return SoxDebug().evaluateError(ret);
Next we print out how many seconds we have processed and read in more data for processing :
Code: Select all
cout<<'\r'<<((float)(++i*N)/sox.getFSIn())<< " seconds written";
if ((ret=sox.read(x, N))!=0) // read in again
    return SoxDebug().evaluateError(ret);
At this point we loop and loop - processing - until we run out of audio samples to process. We then close the audio files which we opened and return :
Code: Select all
sox.closeWrite();
sox.closeRead();
return 0;
That's it !
We need an audio file for testing, so let's download a free one :
wget "https://ogg.jamendo.com/download/track/206411/ogg1/" -O flatmax.CentralTransmission.ogg
Compile your code :
Code: Select all
g++ -o SoxInOut SoxInOut.C `pkg-config --cflags --libs gtkIOStream`
We compile and create the output file SoxInOut. It will print a couple of warnings like so :
Code: Select all
$ g++ `pkg-config --cflags --libs gtkIOStream` -o SoxInOut SoxInOut.C
In file included from SoxInOut.C:1:0:
/usr/include/gtkIOStream/Sox.H:27:32: warning: unknown option after ‘#pragma GCC diagnostic’ kind [-Wpragmas]
#pragma GCC diagnostic ignored "-Wignored-attributes"
^
We can now run the program, here converting the ogg file we downloaded to a wav file :
Code: Select all
$ ./SoxInOut flatmax.CentralTransmission.ogg flatmax.CentralTransmission.wav
That's it ! You can now process audio files of many different formats with your own algorithms. Here is a list of some of the audio formats handled by the Sox object :
Code: Select all
AUDIO FILE FORMATS:The known output file extensions (output file formats) are the following :
8svx aif aifc aiff aiffc al amb amr-nb amr-wb anb au avr awb caf cdda cdr cvs cvsd cvu dat dvms f32 f4 f64 f8 fap flac fssd gsm gsrt hcom htk ima ircam la lpc lpc10 lu mat mat4 mat5 maud mp2 mp3 nist ogg paf prc pvf raw s1 s16 s2 s24 s3 s32 s4 s8 sb sd2 sds sf sl sln smp snd sndfile sndr sndt sou sox sph sw txw u1 u16 u2 u24 u3 u32 u4 u8 ub ul uw vms voc vorbis vox w64 wav wavpcm wv wve xa xi