Archive for the ‘General’ Category

PCM Audio | Part 1: What is PCM?

January 8th, 2010

It’s been a long time since I posted anything. Most of my free time has been spent working on my ventrilo client for linux project. Of course, that project adds tons of things to discuss, such as how PCM audio works. I’m going to make this a multi-part series, because there is so much information to discuss.

When I first started working on that project, I knew nothing about how audio worked. I knew a little bit about encoders and decoders, but not really the inner workings. What are they encoding/decoding? It turns out that the answer is PCM (pulse-code modulation) audio. After messing with PCM for a few months, there are a lot of things that are painfully obvious now that were confusing. This guide is meant to be an introduction to at least give you the working knowledge you’ll need to ask proper questions and perform simple tasks. So let’s get started…

If you’ve ever used a computer MP3 player, you’ve probably seen those options to display the waveform of the audio or the little bars that pop up and down showing you treble and bass levels. What those are measuring is the PCM audio as it plays it. So what does all that crap mean?

Let’s start with the basics. There are five terms that are important to know for PCM:

Sample Rate

Real audio (like someone talking to you in person) is transmitted as a wave. PCM is a digital representation of that audio wave, measured at a specified sample rate. The sample rate is the number of samples taken per second, given in Hz and more often in kilohertz. So when you hear someone talk about 44.1kHz vs. 48kHz audio, what they’re talking about is the sample rate. (The 128 vs. 160 figures you see on MP3s are bitrates in kbps, which is a different thing.) If you’ve ever done integrals in calculus, it’s a lot like that: the more slices you take, the closer you get to the real curve. The higher the sample rate, the better your quality (at the cost of size). There is no guessing here. You need to know what the sample rate is.
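To make the sample rate idea a bit more concrete, here is a rough Python sketch that "samples" a pure sine tone. The function name and the 16-bit amplitude are my own choices for illustration, not anything from a particular audio API.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, num_samples, amplitude=32767):
    """Return integer samples of a sine wave taken at the given sample rate."""
    return [
        round(amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz))
        for n in range(num_samples)
    ]

# One full cycle of a 1 kHz tone takes 8 samples at 8 kHz and 48 samples at 48 kHz.
low = sample_sine(1000, 8000, 8)
high = sample_sine(1000, 48000, 48)
print(len(low), len(high))
```

Same wave, same length of time, but the higher sample rate describes it with six times as many points (and six times as much data).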

Sign

Whether the data is signed or unsigned. It is almost always signed. Treating a signed PCM stream as unsigned will hurt your ears… painfully… (I speak from experience here).
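If you want to see why getting the sign wrong is so painful, this small Python sketch reads the same two bytes both ways. It only uses the standard struct module; the sample value is made up for illustration.

```python
import struct

# A very quiet sample, just below zero, packed as signed 16-bit little-endian.
raw = struct.pack("<h", -1)

as_signed = struct.unpack("<h", raw)[0]    # read correctly
as_unsigned = struct.unpack("<H", raw)[0]  # same bytes read as unsigned

print(as_signed, as_unsigned)
```

The near-silent -1 turns into 65535, which is the very top of the unsigned range. Every small negative value does this, so quiet audio becomes full-volume noise.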

Sample Size

This determines how many bits make up one sample. 16-bit seems to be the most common.

Byte Ordering

Byte ordering refers to little-endian vs. big-endian data. If you don’t know what endian-ness means, you can probably assume little endian. If you have the option to choose endian for your data, you should always choose little-endian.

Number of channels

I’m mostly going to cover mono (1 channel), but multichannel PCM is usually handled by interleaving the PCM samples. Don’t worry about this for now. Once you understand mono, stereo is easy.
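For the curious, interleaving just means the samples alternate between channels (left, right, left, right, …). A minimal Python sketch of pulling them apart, assuming the samples have already been decoded into a list of integers:

```python
def deinterleave(samples, channels):
    """Split interleaved samples into one list per channel."""
    return [samples[ch::channels] for ch in range(channels)]

# Made-up stereo data: left samples are positive, right samples are negative.
stereo = [1, -1, 2, -2, 3, -3]
left, right = deinterleave(stereo, 2)
print(left, right)
```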

Add those five things together and you’ll come up with a description of a PCM stream. For example: signed 16-bit little-endian mono @ 44.1kHz. In order to actually play audio, you’ll need to know those 5 things.
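As a sketch, those five things fit nicely into one small structure. The class and field names below are hypothetical (they don't come from any audio library); the point is that once you have all five, you can work out things like how much data one second of audio takes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PcmFormat:
    signed: bool
    bits: int            # sample size
    little_endian: bool  # byte ordering
    channels: int
    sample_rate: int     # in Hz

    def bytes_per_second(self):
        """Raw data rate of the stream."""
        return self.sample_rate * self.channels * self.bits // 8

# signed 16-bit little-endian mono @ 44.1kHz
fmt = PcmFormat(signed=True, bits=16, little_endian=True, channels=1, sample_rate=44100)
print(fmt.bytes_per_second())
```

For the example stream that works out to 88200 bytes for every second of audio.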

Various sound devices support various types of streams, but there’s usually a set list of sign, sample size, and endian-ness options. Different APIs use different constants to specify, but usually you’ll see them as something like S16LE (signed 16-bit little-endian) or S32BE (signed 32-bit big-endian) and so on.
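Those names are regular enough to pick apart mechanically. Here is a rough Python sketch that splits a name like S16LE into its three parts. The function is my own illustration, and it deliberately ignores names that don't follow this exact pattern (8-bit formats, for example, usually have no endian suffix).

```python
import re

def parse_format_name(name):
    """Split a name like 'S16LE' into sign, sample size, and byte order."""
    match = re.fullmatch(r"([SU])(\d+)(LE|BE)", name.upper())
    if match is None:
        raise ValueError("unrecognized format name: " + name)
    sign, bits, order = match.groups()
    return {
        "signed": sign == "S",
        "bits": int(bits),
        "little_endian": order == "LE",
    }

print(parse_format_name("S16LE"))
print(parse_format_name("S32BE"))
```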

In my next post, I’ll go over how those are represented in a PCM stream.

General

Safari User Agent Strings Are Dumb

October 21st, 2009

Sometimes it’s hard to decide which company is worse: Apple or Microsoft. Apple’s web browser, Safari, has the dumbest User-Agent of any browser out there. Opera, Firefox, and even MSIE figured out how to do it correctly. You’d think Apple would be able to figure it out, too.

Generally, people want to know the major revision of a browser. The minor revision is usually not important since most of the time these are just used for reporting statistics. Well, as I’m here setting up a piece of software that does simple string searches in the user-agent to determine browser version, I got hung up on Safari. Here’s a basic mapping of User-Agent search terms to corresponding browser versions:

opera6 opera/6
opera6 opera 6
opera7 opera/7
opera7 opera 7
galeon1 galeon/1
explorer3 msie 3.
explorer4 msie 4.
explorer5 msie 5.
explorer6 msie 6.
explorer7 msie 7.
explorer8 msie 8.
konqueror2 konqueror/2
konqueror3 konqueror/3
netscape6 netscape6
netscape7 netscape/7
netscape4 mozilla/4
netscape3 mozilla/3
firefox1 firefox/1
firefox2 firefox/2
safari1 safari/1
safari1 safari/85
safari1 safari/3
safari2 safari/4
safari4 safari/5
chrome chrome/0

So check out those Safari searches. The reason they look that way is because Safari doesn’t report its version number as Safari/X. Where there should be a major version number is a WebKit build number or some other useless value. Safari does put the actual version into the User-Agent string, but does it in such a fashion that you either need multiple string searches or regex to figure it out. Check out some Safari User-Agent strings:

Safari 3.1.2:
Mozilla/5.0 (...) AppleWebKit/525.18 (KHTML, like Gecko) Version/3.1.2 Safari/525.20.1

Safari 4:
Mozilla/5.0 (...) AppleWebKit/531.9 (KHTML, like Gecko) Version/4.0.3 Safari/531.9

So by matching Safari/5, you’re actually going to end up with either Safari 3.x or Safari 4.x. Or you could search for “Safari” and then search for “Version/X.” Or you could regex search for “/Version\/4.*Safari/”. Of course, this software I’m configuring doesn’t support either of those methods because only a mind-bogglingly stupid browser maker would use this format in their User-Agent string.
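If your software does let you run real code, the two-step search looks something like this Python sketch. The function name is made up, and the fallback for old Safari builds is only an assumption based on the build numbers in my mapping table (below 400 is 1.x, 400 to 499 is 2.x), so treat it as a starting point rather than a complete detector.

```python
import re

def safari_major_version(user_agent):
    """Best-effort Safari major version, or None if it can't be determined."""
    # Chrome also puts "Safari/" in its User-Agent, so rule it out first.
    if "Safari/" not in user_agent or "Chrome/" in user_agent:
        return None
    # Safari 3 and later report the real version in a Version/X token.
    version = re.search(r"Version/(\d+)", user_agent)
    if version:
        return int(version.group(1))
    # Older Safari only has a build number after "Safari/".
    build = re.search(r"Safari/(\d+)", user_agent)
    if build is None:
        return None
    number = int(build.group(1))
    if number < 400:
        return 1
    if number < 500:
        return 2
    return None

ua = "Mozilla/5.0 (...) AppleWebKit/531.9 (KHTML, like Gecko) Version/4.0.3 Safari/531.9"
print(safari_major_version(ua))
```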

Oh, and if you’re wondering why Safari/85 matches Safari 1.x, here’s some 1.x User-Agents:

Mozilla/5.0 (...) AppleWebKit/85.8.5 (KHTML, like Gecko) Safari/85.8.1
Mozilla/5.0 (...) AppleWebKit/125.5.6 (KHTML, like Gecko) Safari/125.12
Mozilla/5.0 (...) AppleWebKit/312.1 (KHTML, like Gecko) Safari/312

General

PulseAudio: An Async Example To Get Device Lists

October 13th, 2009

I have a love/hate relationship with PulseAudio. The PulseAudio simple API is… well… simple. For 99% of the applications out there, you’ll rarely need anything more than the simple API. The documentation leaves a little to be desired, but it’s not too hard to figure out since you have the sample source code for pacat and parec.

The asynchronous API, on the other hand, is really complex. The learning curve isn’t really a curve. It’s more like a brick wall. Compounding the issue is that the documentation is atrocious. If you know exactly what you’re looking for and if you already know how it works, the documentation can be helpful.

More importantly, simple example code is nearly impossible to come by. So, since I took the time to figure it out, I figured I would document this here in the hopes that this little example will help someone else. This is not production-ready code. There’s a lot of error checking that’s not being done. But this should at least give you an idea of how to use the PulseAudio asynchronous API.

Update: I spoke with the PulseAudio team and they encouraged me to put this source code on their wiki. So now you can find it at the main PulseAudio wiki: http://pulseaudio.org/wiki/SampleAsyncDeviceList

General

Mangler: Ventrilo for Linux

October 7th, 2009

Many a gamer has fought with the Linux migration because of Ventrilo. Most games run just fine under wine, but when it comes to VOIP apps for gamers, Ventrilo is by far the leader. The official Linux Ventrilo client has been “in development” for about 3 or 4 years now.

Over the past couple of years, a group of open source developers has spent a lot of time reverse engineering the Ventrilo protocol. Based on their work, we have created an almost functional client. It is stable (in the sense that it doesn’t crash) and you can receive audio if you use the PulseAudio daemon.

The website for Mangler is http://www.mangler.org/.

For the latest developer info, we’ve set up a trac page at http://www.mangler.org/trac for bug reports and/or following our progress.

If you want to help and you have experience with C, C++, Linux audio systems (pulse, alsa, oss, etc), and/or GTK+, join us on irc.freenode.net in #mangler and see if you can help us. In the immediate short term, we need graphic designers (bring us a logo!). Interface designers are welcome as well. If you want to help make this happen, post a comment and/or join us on IRC.

General

Got a PS3? Want Hulu Back? And you’re a Windows user?

September 22nd, 2009

As a follow-up to unblocking Hulu on the PS3, I’ve gotten tons of responses. While that solution will successfully work in Windows, setting up squid on a Microsoft OS can be a painful task. Windows makes the simplest things incredibly difficult.

But squid was definitely designed for Linux, so that’s not incredibly surprising.

Thankfully, for the windows users out there, Jonathan Morales has this to contribute:

You can set this up in half the time using windows and an old program called proxomitron. You might want to post this for those who just want it to work easily:

step 1) Download proxomitron from http://www.proxomitron.info
step 2) open proxomitron and uncheck everything except the outgoing header filters
step 3) open outgoing header filters and uncheck them all, but find one of the user_agent ones and modify it as such:

  1. change the header name to user-agent:firefox win32
  2. (optional) in the url match, put *.hulu.com – this will only activate the header if you go to that url
  3. header value match should be *
  4. change the value to Mozilla/5.0 (Windows NT 5.1; en-US) Firefox/3.0.11
  5. make sure that the outgoing checkbox is checked but not the in

step 4) on the proxomitron main window go to config, click the access tab and set to allow connections from your local network e.g. 192.168.0.1 to 192.168.0.255

step 5) go to file>save default settings, and then close and reopen proxomitron

step 6) On the ps3 network settings, set it to use a proxy server and point it to the ip address of your windows machine running proxomitron, port 8080.

Other comments:

it’s a good practice to set your ip address on the windows machine that runs proxomitron to be static, so that there’s no chance it’ll change and your ps3 starts pointing to an ip address that doesn’t exist or is assigned to another computer.

General, PlayStation3

The good old English to English translator

August 31st, 2009

Some years ago, for a reason that I don’t remember, I wrote an English to English translator. Basically it translates from English through a whole host of other languages and then back to English.

Anyhow, someone out there on the Internet found it amusing and posted this comment on an unrelated entry:

dude your english to english converter rocks, where can i find something like this for microsoft word or a web version that i can use on large documents

So I responded in email, which spawned this conversation:

random internet guy to me:
Thank you for returning to me, I am very happy with this tool is useful for something in the translation from English into another language besides English, they speak English well, or to say that, like language, queen of Great Britain is cold, this tool is that the majority of Britons seem silly, but it is a kind of spiritual support, he plays all the tribal mentality of Le Mans. Whether or not interest me, but it seems very impressive and culturally cool. But Google was that of syntax or algorithm that is used, because we noticed that in motion, or at least five different languages.

me to random internet guy:
I am glad that he was a useful tool. I do not know why and I said, a translator, but it must be useful, since it is often in response to people who actually come to conflicting messages. Sometimes I wish I had heard that people are sent e-mail, yes, but again I sometimes think that life is absurd afraid of me. I do not know better, what the hell is going on in the head. In any case, I am pleased that, as a useful and entertaining.

I love the Internet.

General

Time Lapse Video using gphoto2 and ffmpeg

August 30th, 2009

An interesting little project I’ve been working on is time lapse photography. I picked up a used Canon Powershot A520 pretty cheap, and set up a laptop with Ubuntu to communicate with the camera. I’m still working on the best angle to minimize the power lines out front, but I’ve got a good start going.

What you’ll need:

  • A Linux machine (a laptop really helps)
  • gphoto2 >= 2.4.5 (note that you can upgrade jaunty’s gphoto2 with the karmic packages to get this version)
  • A camera that supports remote capture
  • An AC power outlet near where you want to take your photos (and an AC adapter for your camera unless you have really awesome batteries).
  • jpeg2yuv and ffmpeg (with libx264 support)
  • Something relatively interesting to take pictures of

This is what I ended up with:

So here’s what I did:

  1. Connect the USB cable to the camera
  2. Run the following command (in a while loop in case it crashes):
    while true ; do
        gphoto2 --capture-image-and-download -I 30
    done

  3. Wait about 8 hours or so
    1. If you’re impatient like me, you can nfs mount the laptop after about 45 frames (about 20 minutes) and get a preview.
    2. You can rsync the laptop’s nfs mounted directory locally so you don’t have to copy the files over (most likely) wireless every time you want to encode the latest version
  4. Collect all of your images and make sure that each frame is numbered sequentially.
  5. Create an MPEG with jpeg2yuv by piping the output to ffmpeg:
    startframenum=XXXX # put the number of the first image in the sequence here
    jpeg2yuv -b $startframenum \
            -v 0 \
            -j the/path/to/your/images/IMG_%04d.JPG \
            -f 15 \
            -I p | ffmpeg -threads 2 -y -i - \
            -vcodec libx264 \
            -b 2500k \
            -acodec libfaac -ab 48k -ar 48000 -ac 2 \
            -s 1024x768 -f mp4 \
            outputfile.mp4

  6. In the options above, the important ones are the jpeg2yuv “-f” option, which is the framerate (the ffmpeg “-f” is just the output format). You can change the framerate to speed up or slow down your movie. The ffmpeg “-s” option is the size. Keep in mind that the width and height of your output need to be multiples of 16 (i.e., 640×480, 1024×768, 1920×1152, etc). Note that 1080 is not divisible by 16. 1280×720 will work for widescreen (16:9) hi-def though. Lastly, the ffmpeg “-b” option is the video encoding bitrate. Increase it for better quality and decrease it for smaller output movie files.
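The arithmetic behind those numbers is easy to get wrong in your head, so here is a small Python sketch. The function names are made up for illustration; the 30 second interval, 15 fps framerate, and multiple-of-16 rule are the ones from the steps above.

```python
def frames_captured(hours, interval_seconds):
    """How many photos you get from shooting one frame every interval_seconds."""
    return int(hours * 3600 / interval_seconds)

def clip_seconds(num_frames, fps):
    """Length of the finished movie at the given framerate."""
    return num_frames / fps

def round_down_to_16(value):
    """Largest multiple of 16 that is not bigger than value."""
    return value - value % 16

frames = frames_captured(8, 30)
print(frames, clip_seconds(frames, 15))
print(round_down_to_16(1080), round_down_to_16(720))
```

So 8 hours of shooting gives 960 frames, which is only 64 seconds of video at 15 fps. And 1080 rounds down to 1072, while 720 is already a multiple of 16.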

General

Netflix Has a Developer API

July 27th, 2009

I wasn’t incredibly happy with the movie synopses I was getting from IMDB. They’re generally pretty crappy. I went looking around to see if I could scrape the Netflix synopses, and lo-and-behold, Netflix has an API!

In another open source project I’m working on, I have a need to learn GTK+. So I figured the easy way to learn GTK+ was to start with php-gtk. It’s more-or-less a replica of the gtkmm OO interface, so I set out to update my little movie categorization script with a GTK+ interface. After learning the ropes, I finally have a nice interface that queries Netflix and returns all of their data for display.

This is what I have so far (keep in mind this is all in PHP):

[Screenshot: yflix movie categorizer and Netflix manager]

When you click on a movie in the list, it queries Netflix and fills out the description pane. So far it’s really simple, but hopefully I can use this to generate something that will categorize movies specifically for a UPnP client. I can’t put any source code out yet since I’m not too sure how the Netflix API deals with publishing an app. Right now, it has my personal developer key hard coded, and I only get 5000 queries per day.

Here’s a video (and of course, you’ll need Firefox 3.5):

General

HTML 5 and the <video> tag!

July 11th, 2009

With Firefox 3.5, we get our first real chance to use the <video> tag. And man, it is awesome. If you’re using IE, Firefox 3.0, or Chrome, you won’t be able to see it. It seems there’s all kinds of debate about the codecs that the browser should support. Right now, Firefox uses ogg theora/vorbis.

Apple is apparently complaining about the ogg format. They say there are possible patent issues they want to avoid, which is a weird claim since ogg is specifically designed to be free of such restrictions. More likely, they want to implement their own proprietary .mov format or possibly mp4. Either way, I’m sure it boils down to Apple wanting more money. This would also hamper open source projects’ ability to use the video tag since there is no way to license the corporate proprietary formats.

Microsoft is strangely quiet about the whole thing. I’m guessing they’ll put out only WMV support and call it a wrap. They may just embed WMP into the webpage to handle the player. I wonder how long it will take to initialize that.

Anyhow, here is the video tag in action (OGG Theora variable bit rate, Vorbis: 96kb/s 2 channel):

Hopefully Sam and The Chin won’t mind me using this clip for a demonstration.

On a side note, is there a way to make Windows Media Player play H.264 encoded movies? Does it not support that capability?

General

DVI to HDMI overscan (screen edge cutoff) on an HDTV

July 3rd, 2009

Update – 4/1/2010: Latest nVidia drivers have overscan correction built in

Well I learned something new recently. I have a friend that’s making the Ubuntu switch and he called me up with a bizarre problem. He’s using an nVidia card (although other cards have the same issue) with a DVI out port to a DVI->HDMI converter to an HDMI input on a 26″ HDTV that he uses as a monitor.

He called me up and described the problem and I confessed that I had never heard of this before. All 4 sides of his screen were getting cut off. He could only see part of his menu bars at the top and bottom and the left/right edges were cut off as well. After some Googling, I at least found the name for the problem: overscan.

And once I figured out the name, that’s when my Google searches became eye openers. There are a lot of people out there with overscan problems and there are very few solutions in Linux. The Windows nVidia drivers allow dynamic overscan correction inside of their driver toolbox. The X server nVidia drivers have no options (for DVI out… for TV out there apparently are).

The problem, as I understand it, is that the PC is sending a DVI PC style output, but the TV is reading an HDMI TV style input. As such, the TV thinks it’s receiving a TV signal and acts accordingly. If your TV has a DVI input, it should treat that as a PC input and give you 1:1 pixel mapping (which is what you’re looking for). If not, you’ll need to adjust for the overscan on the PC side. Some TVs even have an option to treat an HDMI signal as if it were PC. Check your TV’s manual.

Anyhow, there are a lot of people asking for help with this issue, but it is very hard to find any actual information.

Option 1 – Manually

I don’t know if this works, but it looks like good info. If you’re looking for a way to fix this (and you’re ready to spend quite a while doing it), you should read this:

Ubuntu Forums: Nvidia, Modelines, Overscan…8.10

Basically it’s trial and error to get the correct X server config’s Modeline. It’s mindboggling that no one (especially nVidia, which seems to care about Linux a little bit) has put out any definitive information on this topic.

Option 2 – A little less manually

I definitely don’t know if this works. I don’t know if anyone has even tried it. If this works/doesn’t work for you, post in the comments.

You can see if the XFree modeline generator will give you something that works. I don’t really understand what all the modeline timings mean, but here’s a shot in the dark (You’re probably desperate at this point anyway… and I have no way of testing this so I don’t know if it even works at all). Also, I’ll give the same warning everyone gives on this… I take no responsibility at all if this damages your television. Try this at your own risk.

First things first, back up your xorg.conf file (/etc/X11/xorg.conf) somewhere safe (like your home directory).

I wrote a quick program that will help you determine your visible screen size:

Source: findcoords.c (source)
Binary: findcoords (compiled on Ubuntu 9.04)

If the binary doesn’t work for you or you’d prefer to compile from source, you’ll need the libx11 development packages installed (as well as the standard stuff like gcc and whatnot). On Ubuntu, running “sudo apt-get install build-essential libx11-dev” should do the trick. To compile it run: gcc -o findcoords findcoords.c -lX11

Now run it by typing ./findcoords

It’ll tell you to click the upper left and bottom right corners of the screen. Get as close as possible. You want the very point of the cursor as close to the edge as possible. That means in the bottom right, you should only be able to see about 1 pixel of your cursor. When you’ve done that it’ll calculate your viewable screen size. It will output something like this:

Root Window Size: 2880x900
Viewable Size: 2764x798
Your screen is cut off by the following number of pixels:
Left  : 31
Right : 85
Top   : 24
Bottom: 78
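The arithmetic behind a report like that is simple. Here it is as a Python sketch rather than the actual findcoords.c source; the function name is made up, and the click coordinates are ones I picked so that the result matches the sample output above.

```python
def overscan_report(root_size, top_left, bottom_right):
    """Work out the viewable area from the two clicked corner positions."""
    root_w, root_h = root_size
    x1, y1 = top_left
    x2, y2 = bottom_right
    return {
        "viewable": (x2 - x1, y2 - y1),
        "left": x1,
        "right": root_w - x2,
        "top": y1,
        "bottom": root_h - y2,
    }

# Root window 2880x900, clicks at (31, 24) and (2795, 822).
report = overscan_report((2880, 900), (31, 24), (2795, 822))
print(report)
```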

Armed with the actual visible screen size, head over to the XFree Modeline Calculator (it works for Xorg too).

1. Enter the values under “Monitor Configuration” if you know them. If not leave that section blank.
2. Under “Basic Configuration” enter the viewable size that got output from findcoords.
3. If you know the max refresh rate for your TV, you can enter it here. If not, just use 60Hz.
4. If you know the dot clock frequency enter it as well, otherwise, just leave it blank.
5. IMPORTANT: If your TV is interlaced at max resolution (i.e. 1080i), check the interlaced button.
6. Click the “Calculate Modeline” button and it should give you a modeline at the top of the screen.
7. In your xorg.conf file, put the modeline it gives you into the Monitor section
8. Add this line to your Monitor section as well:

Option "ExactModeTimingsDVI" "TRUE"

9. Now, to use this, you’ll need to add this line to your Device section:

Option "UseEDID" "FALSE"

10. Then in the Display section, add a line that LOOKS like this, but define the mode specified in the modeline that the generator gave you:

Modes "1960x1080@60i"

In other words, if the modeline generator spit out:

Modeline "1816x980@60i" 65.89 1816 1848 2096 2128 980 1002 1008 1031 interlace

You would put the following in the Display section:

Modes "1816x980@60i"

That text has to match EXACTLY. When it’s all said and done, you should end up with an xorg.conf that looks something like this:

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "DELL S199WFP"
    HorizSync       30.0 - 83.0
    VertRefresh     56.0 - 75.0
    Option         "ExactModeTimingsDVI" "TRUE"
    Modeline "1816x980@60i" 65.89 1816 1848 2096 2128 980 1002 1008 1031 interlace
EndSection

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "GeForce 9800 GT"
    Option         "UseEDID" "FALSE"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    SubSection     "Display"
        Depth       24
        Modes       "1816x980@60i"
    EndSubSection
EndSection
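Since the mode name has to match exactly, a quick sanity check can save you a restart of X. This is a rough Python sketch that scans xorg.conf text for Modes entries with no Modeline of the same name. The function names are hypothetical, and note that it will also flag the server's built-in modes (like a plain "1024x768"), which don't need a Modeline at all.

```python
import re

FLAGS = re.MULTILINE | re.IGNORECASE

def modeline_names(conf_text):
    """Names defined by Modeline entries."""
    return set(re.findall(r'^[ \t]*Modeline[ \t]+"([^"]+)"', conf_text, flags=FLAGS))

def mode_names(conf_text):
    """Names requested on Modes lines (a line may list several)."""
    names = []
    for rest in re.findall(r'^[ \t]*Modes[ \t]+(.*)$', conf_text, flags=FLAGS):
        names.extend(re.findall(r'"([^"]+)"', rest))
    return names

def unmatched_modes(conf_text):
    """Requested mode names that have no Modeline with the same name."""
    defined = modeline_names(conf_text)
    return [name for name in mode_names(conf_text) if name not in defined]

conf = '''
    Modeline "1816x980@60i" 65.89 1816 1848 2096 2128 980 1002 1008 1031 interlace
        Modes       "1816x980@60i"
'''
print(unmatched_modes(conf))
```

An empty list means every requested mode has a matching Modeline. To check your real file, read /etc/X11/xorg.conf into a string and pass it in.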

Give that a shot and see what you get. Can’t be much worse, can it? If it doesn’t work, just revert back to what you had by replacing your xorg.conf file from the backup. If you get any halfway decent results at all, let me know.

More terms to know:

1-to-1 pixel mapping: If your HDTV (as a monitor) supports this option, chances are this will solve your problem. This means that every pixel sent by the PC will be mapped to a pixel on the screen (i.e. disable overscan).

Full Pixel: This is the same as 1:1 pixel mapping

Modelines: Definitions of video modes that control the display size in the X server

Overscan: Part of standard TV input where a percentage of the edges of the screen are cut off. Not noticeable for normal TV viewing, but very noticeable on a PC desktop.

EDID: Monitor/TV device information telling the PC what modes are supported (stored in the monitor and not configurable)

Good luck.

General