I'll admit early on that thinking in terms of audio signals and frequencies has never been a strong point for me, so that definitely contributed to the confusion.
Although Android docs are fairly verbose, once in a while you'll venture into a documentation jargon minefield.
Below is sample code that I got working, which handles:
- Access to the microphone stream
- Make use of the device's noise cancellation capabilities
- Ensure that the data stream is read and stored in 16-bit
This makes it possible to record speech audio clearly, with minimal background noise.
public static final int RECORDER_SAMPLERATE = 8000;
public static final int RECORDER_CHANNELS = AudioFormat.CHANNEL_IN_MONO;
public static final int RECORDER_AUDIO_ENCODING = AudioFormat.ENCODING_PCM_16BIT;

private void record() {
    // isRecording is assumed to be a boolean field on the enclosing class.
    AudioRecord audioRecorder = null;
    int bufferSizeInBytes, bufferSizeInShorts;
    int shortsRead;
    short[] audioBuffer;

    try {
        // Get the minimum buffer size required for the successful creation of an AudioRecord object.
        bufferSizeInBytes = AudioRecord.getMinBufferSize(
                RECORDER_SAMPLERATE,
                RECORDER_CHANNELS,
                RECORDER_AUDIO_ENCODING
        );
        bufferSizeInShorts = bufferSizeInBytes / 2;

        // Initialize the audio recorder.
        audioRecorder = new AudioRecord(
                MediaRecorder.AudioSource.VOICE_RECOGNITION,
                RECORDER_SAMPLERATE,
                RECORDER_CHANNELS,
                RECORDER_AUDIO_ENCODING,
                bufferSizeInBytes
        );

        // Start recording.
        audioBuffer = new short[bufferSizeInShorts];
        audioRecorder.startRecording();
        isRecording = true;

        while (isRecording) {
            shortsRead = audioRecorder.read(audioBuffer, 0, bufferSizeInShorts);
            if (shortsRead == AudioRecord.ERROR_BAD_VALUE || shortsRead == AudioRecord.ERROR_INVALID_OPERATION) {
                Log.e("record()", "Error reading from microphone.");
                isRecording = false;
                break;
            }
            // Whatever your code needs to do with the audio here...
        }
    }
    finally {
        if (audioRecorder != null) {
            audioRecorder.stop();
            audioRecorder.release();
        }
    }
}
Access to the microphone stream: Shorts vs. bytes
I wanted to keep the audio stream's bitrate low, but a word of warning: you can't trust AudioFormat.ENCODING_PCM_8BIT. If a device doesn't support 8-bit PCM, getMinBufferSize() will return an error instead of a usable buffer size.
From the docs:
AudioFormat.ENCODING_PCM_8BIT: Audio data format: PCM 8 bit per sample. Not guaranteed to be supported by devices.
AudioFormat.ENCODING_PCM_16BIT: Audio data format: PCM 16 bit per sample. Guaranteed to be supported by devices.
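If you still want to try 8-bit first and fall back gracefully, a minimal sketch of that probe might look like this (the 8000 Hz mono parameters are just the ones from the sample above; adjust to your own setup):
// Probe whether 8-bit PCM is usable on this device.
// getMinBufferSize() returns ERROR_BAD_VALUE when the parameter combination
// isn't supported, or ERROR if the hardware couldn't be queried.
int encoding = AudioFormat.ENCODING_PCM_8BIT;
int minSize = AudioRecord.getMinBufferSize(8000, AudioFormat.CHANNEL_IN_MONO, encoding);
if (minSize == AudioRecord.ERROR_BAD_VALUE || minSize == AudioRecord.ERROR) {
    // Fall back to 16-bit PCM, which is guaranteed to be supported.
    encoding = AudioFormat.ENCODING_PCM_16BIT;
    minSize = AudioRecord.getMinBufferSize(8000, AudioFormat.CHANNEL_IN_MONO, encoding);
}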
The other thing you have to calculate is the minimum buffer size. The tricky part is that the returned value is in bytes (8-bit), not shorts (16-bit). This matters because our audio comes in as 16-bit samples.
To confirm, we can look at the sizes of the Byte and Short data types:
Byte.SIZE is 8, i.e. a byte holds 8 bits
Short.SIZE is 16, i.e. a short holds 16 bits
So to convert the buffer size from bytes to shorts: 16 bits / 8 bits = 2
Which is how this magic line of code came to be.
bufferSizeInShorts = bufferSizeInBytes / 2;
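If you prefer to spell the conversion out instead of hard-coding the 2, the same calculation can be written with the size constants mentioned above:
// Short.SIZE is 16 and Byte.SIZE is 8, so this divides by 2.
bufferSizeInShorts = bufferSizeInBytes / (Short.SIZE / Byte.SIZE);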
Bonus tip: handy snippets to convert between the byte[] and short[] data types.
To convert short[] to byte[]:
byte[] byteBuffer = new byte[shortsRead * 2];

// Copy only the samples that were actually read into a smaller buffer.
short[] x = new short[shortsRead];
for (int i = 0; i < shortsRead; i++) {
    x[i] = audioBuffer[i];
}

ByteBuffer.wrap(byteBuffer).order(ByteOrder.BIG_ENDIAN).asShortBuffer().put(x);
The reason we need to create a temporary short[] called "x" is that AudioRecord won't necessarily fill every slot in audioBuffer. Copying just the shortsRead samples that were actually read keeps the source array the same size as byteBuffer; passing the whole audioBuffer in would overflow the ByteBuffer and throw an exception.
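If you'd rather avoid the temporary array altogether, ShortBuffer also has a put(short[] src, int offset, int length) overload that copies just the samples that were read; a quick sketch using the same audioBuffer and shortsRead from above:
// Same conversion without the intermediate copy: only the first shortsRead
// samples of audioBuffer are written into byteBuffer.
byte[] byteBuffer = new byte[shortsRead * 2];
ByteBuffer.wrap(byteBuffer)
        .order(ByteOrder.BIG_ENDIAN)
        .asShortBuffer()
        .put(audioBuffer, 0, shortsRead);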
To convert from byte[] to short[]:
Thankfully this is much shorter. Just make sure shortBuffer is allocated with half as many elements as byteBuffer has bytes.
ByteBuffer.wrap(byteBuffer).order(ByteOrder.BIG_ENDIAN).asShortBuffer().get(shortBuffer);
Ensure that the data stream is read and stored in 16-bit
Now you might be wondering why I spent so much time writing about silly bytes and shorts.
Well, even though you're streaming audio data at 16-bit, you can still end up handling it 8 bits at a time by pulling it out as bytes rather than shorts.
This bit is vital. In tandem with noise cancellation, we also want to preserve the perceived audio quality for the user. If you only keep 8 bits of each 16-bit sample, you're damaging the audio source, and the sound becomes fuzzy and unclear.
Be sure to use the right AudioRecord.read() function:
- public int read (short[] audioData, int offsetInShorts, int sizeInShorts)
- public int read (byte[] audioData, int offsetInBytes, int sizeInBytes)
Now remember what we learned about bytes and shorts in the last section?
You need short shorts!
Given that audioBuffer is a short[], this will ensure that you'll call the right read() function.
shortsRead = audioRecorder.read(audioBuffer, 0, bufferSizeInShorts);
Make use of the device's noise cancellation capabilities: Audio source
Looking briefly at the AudioRecord setup:
audioRecorder = new AudioRecord(MediaRecorder.AudioSource.VOICE_RECOGNITION, ...)
You might be wondering why I didn't just use MediaRecorder.AudioSource.MIC rather than the oddly named MediaRecorder.AudioSource.VOICE_RECOGNITION.
Well at first I did use MIC, until I found out the hard way that MIC will record EVERYTHING. Soft taps on the table or keyboard, rustling of a plastic bag, someone coughing in the background, etc.
You'll hear everything. Every. Single. Thing.
For my app in particular I needed to record speech WITHOUT all the distracting background fuzz. Using VOICE_RECOGNITION will pass the audio data into the phone's noise cancellation filters before handing it over to you.
So unless you're comfortable with implementing your own noise cancelling algorithm, it's much easier to just let the phone's built-in hardware handle it.
Another way to reduce noise is the NoiseSuppressor audio effect, available since Jelly Bean (Android 4.1, API level 16). I haven't looked into it very much as I still need to support Android 2.3.
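For what it's worth, hooking NoiseSuppressor up to the AudioRecord from the sample above only takes a few lines on API 16+. A minimal sketch, which silently does nothing if the device doesn't provide a suppressor:
// android.media.audiofx.NoiseSuppressor, API level 16+.
// Attach the platform noise suppressor to the existing AudioRecord session.
if (NoiseSuppressor.isAvailable()) {
    NoiseSuppressor suppressor = NoiseSuppressor.create(audioRecorder.getAudioSessionId());
    if (suppressor != null) {
        suppressor.setEnabled(true);
    }
}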
The last method of noise cancellation is to use AudioManager.setParameters(), which has been available since API level 5.
am = (AudioManager) activity.getSystemService(Context.AUDIO_SERVICE);
am.setParameters("noise_suppression=auto");
I'm uncertain how effective this is, as there's little to no feedback on its success, or on whether manufacturers have even implemented it at all.
I have little confidence in it, judging by how little detail there is in the AudioManager.setParameters() documentation. Even the source code for setParameters() doesn't say anything useful. If you're keen on using it, use it in conjunction with the methods above.
I trust setParams() as much as I trust this pole in a storm.
After all that explanation, I believe the rest of the sample code is fairly straightforward. The method simply loops until the audio stream fails to read, much like reading from a socket or file. Lastly, remember to clean up after yourself by stopping and releasing the recorder, and everything should be happy.
As always, feel free to donate (link below) if you found this information helpful!
Sources
If you feel like digging into the various sources of information I used to learn all this, the links are below.
- audio - Android AudioRecord questions? - Stack Overflow
- how to amplify the frequency of audio in android or reduce the background noise - Stack Overflow
- MediaRecorder.AudioSource | Android Developers
- microphone - Query noise level in Android - Stack Overflow
- java - Android - using AudioManager to play/pause music? - Stack Overflow
- NoiseSuppressor | Android Developers
- Android Bridge: How to turn on noise suppression in Android
- android - How avoid automatic gain control with AudioRecord? - Stack Overflow
- Speex: a free codec for free speech
- byte array to short array and back again in java - Stack Overflow