
Threaded/Parallel Web Crawler (or Web Server Killing Software)

Short Version


Parallel URL Fetcher – If you want to put load on a webserver by crawling it, this is what you’re looking for. No Java, no Python, just a nice small, fast C program.

Long Version


It’s time to re-evaluate our HTTP caching software. At present we use Apache mod_cache (disk cache) and we’ve run into some problems.

Apache mod_cache + ZFS + millions of URLs and hundreds of gigs of cache files = bad

I’m not sure which of these guys is the culprit in this one. But I do know that when the ZFS dataset holding Apache’s cache gets to a certain size, disk I/O requests go through the roof. By clearing the cache (and freeing up that I/O), we see a good 5%-10% (extremely significant) jump in traffic.

At any rate, this prompted us to start looking into alternatives to Apache. The obvious first choice is Squid in accelerator mode. So I got Squid all set up in our offline datacenter, fixed the little things, and was ready to beat the crap out of it with web requests.

I can easily request all of our 500k+ “static” URLs, but those pesky URLs with arguments aren’t quite that easy. I needed a crawler. Something like wget --mirror but much, much, much faster.

After a lot of searching, I found a few Python apps that failed to compile on Solaris, had deprecated/old dependencies, required a specific Python version, etc. Python is starting to feel more and more like Java. Either the developers are horrible or the language interpreter is too picky to work properly (think…. JRE 1.2.5 build 1482???? no no no, you need build 1761!!!).

Speaking of Java, I also found a Java app (JCrawler) that looked perfect for what I needed. It certainly claimed to be “perfect.” It actually worked better than the Python apps that failed to build/run properly, but it didn’t actually work. It just kept spawning threads until it ran out of memory.

I was almost to the point where I thought I would have to write one myself, until I clicked on a link and a bright light from the heavens shone down on my monitor and a choir started singing in the background.

I had found the Parallel URL Fetcher. It was exactly what I needed. It was like wget, but ran parallel requests. It didn’t compile on Solaris either, but adding timeradd() and timersub() macros fixed that real quick.
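For reference, the missing macros look roughly like this. This is a sketch modeled on the conventional BSD-style definitions, not necessarily the exact text I pasted into the fetcher's source, and the #ifndef guards are my own assumption so the block is harmless on systems that already provide them.

```c
#include <sys/time.h>

/* Fallback definitions for platforms whose <sys/time.h> lacks them.
 * Modeled on the conventional BSD macros; the guards are an assumption. */
#ifndef timeradd
#define timeradd(a, b, result)                              \
    do {                                                    \
        (result)->tv_sec = (a)->tv_sec + (b)->tv_sec;       \
        (result)->tv_usec = (a)->tv_usec + (b)->tv_usec;    \
        if ((result)->tv_usec >= 1000000) {                 \
            (result)->tv_sec++;                             \
            (result)->tv_usec -= 1000000;                   \
        }                                                   \
    } while (0)
#endif

#ifndef timersub
#define timersub(a, b, result)                              \
    do {                                                    \
        (result)->tv_sec = (a)->tv_sec - (b)->tv_sec;       \
        (result)->tv_usec = (a)->tv_usec - (b)->tv_usec;    \
        if ((result)->tv_usec < 0) {                        \
            (result)->tv_sec--;                             \
            (result)->tv_usec += 1000000;                   \
        }                                                   \
    } while (0)
#endif
```

Both macros take pointers to struct timeval and normalize the microseconds field so it stays between 0 and 999999.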

I don’t think it supports Keep-Alive requests either, which would have been nice, but either way it rocked through some URLs. After letting it run for a few hours, I had my Squid server maxed out at 100Gigs of cache and ready for some I/O testing.


Posted by: eric on 01/26/2010

PCM Audio | Part 3: Basic Audio Effects - Volume Control

So now we know what data is stored in a PCM stream, let’s look at some real waveform examples. The easiest is a simple sine wave:

[Image: sine wave]

Now if we “amplify” that wave by 10, we’d get a much louder sound, represented by a wave that looked like this:

[Image: sine wave times 10]

So if you want to increase the volume of your PCM stream, just multiply every PCM value by some number. If we had 2048 bytes of audio (remember… that’s 1024 samples since each sample is two bytes), we could amplify the stream with this type of code:

int16_t pcm[1024]; /* assume this has been filled with 1024 samples of PCM data */
int ctr;
for (ctr = 0; ctr < 1024; ctr++) {
    pcm[ctr] *= 2;
}

Volume control is almost that simple. There are two catches.

Clipping

Clipping occurs when the result of your multiplication goes above the maximum (or below the minimum) value for a sample. Since we're dealing with signed 16-bit integers, our maximum positive sample is 32767. If we have a PCM sample value of 5000 and we multiply it by 10, the value stored back into the 16-bit sample wraps around to -15536, not the expected 50000. When this happens, you end up with noise in the audio. You should always check to see if the result of your multiplication is out of range, and if so, set the value to 32767 (or -32768) instead.

So our code above becomes:

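int16_t pcm[1024]; /* assume this has been filled with 1024 samples of PCM data */
int32_t pcmval;
int ctr;
for (ctr = 0; ctr < 1024; ctr++) {
    pcmval = pcm[ctr] * 2;      /* do the math in 32 bits so it can't wrap */
    if (pcmval > 32767) {
        pcm[ctr] = 32767;       /* too high: clamp to the maximum */
    } else if (pcmval < -32768) {
        pcm[ctr] = -32768;      /* too low: clamp to the minimum */
    } else {
        pcm[ctr] = pcmval;      /* in range (including exactly 32767 or -32768) */
    }
}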

Volume Is Logarithmic

The other catch is that volume as perceived by humans (measured in decibels) is logarithmic, not linear. Your first instinct would be to think "Well if I wanted to double the volume, I should just multiply the samples by 2." Unfortunately, it's not quite that easy.

Multiplying a value by 1 will obviously give you no amplification. So to decrease volume, you would multiply by a value less than 1 and greater than 0. To increase volume, multiply by a number greater than one. Unfortunately, I didn't pay enough attention to logarithms in school, so I don't have a clever answer as to how to implement a proper volume control, but I've found that this function works pretty well:

int some_level;   /* slider position, somewhere between 0 and 148 */
float multiplier = tan(some_level/100.0);   /* tan() comes from <math.h> */

If some_level is set to a value between 0 and 148 or so, this will give you a rather linear sounding multiplier. 79 is almost a multiplier of 1 (no amplification). It is far -- really far -- from perfect, but it worked well enough for my needs of implementing a volume slider. Graphing that function from 0 to 148 gives you this:

[Image: volume multiplier]

So, with a volume slider set to 39 (roughly half volume), our code becomes:

int16_t pcm[1024]; /* assume this has been filled with 1024 samples of PCM data */
int32_t pcmval;
int ctr;
uint8_t level = 39; // half as loud
// uint8_t level = 118; // twice as loud (79 * 1.5)
float multiplier = tan(level/100.0);
for (ctr = 0; ctr < 1024; ctr++) {
    pcmval = pcm[ctr] * multiplier;
    if (pcmval > 32767) {
        pcm[ctr] = 32767;
    } else if (pcmval < -32768) {
        pcm[ctr] = -32768;
    } else {
        pcm[ctr] = pcmval;
    }
}

I wasn't able to find a simple logarithmic slider example, so if you have one, please post in the comments. I'd love to replace my hack.
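For what it's worth, one common way to get a logarithmic feel (a sketch, not what I actually used) is to make every slider step change the level by a fixed number of decibels, which means multiplying the gain by a fixed ratio per step. The function name slider_to_gain, the 0 to 100 range, and the 0.6 dB step size are all assumptions picked for illustration.

```c
/* Hypothetical helper: map a slider position (0..100) to a linear gain.
 * Assumptions: 100 is unity gain, each step down lowers the level by
 * 0.6 dB (so position 1 is about -59.4 dB), and 0 is treated as mute. */
double slider_to_gain(int level) {
    /* 10^(-0.6/20): the linear ratio equal to a 0.6 dB drop */
    const double step_ratio = 0.9332543;
    double gain = 1.0;
    int i;

    if (level <= 0) {
        return 0.0;
    }
    if (level > 100) {
        level = 100;
    }
    /* One multiplication per step below the top of the slider */
    for (i = level; i < 100; i++) {
        gain *= step_ratio;
    }
    return gain;
}
```

With these numbers, every 10 steps down is about -6 dB, which roughly halves the sample values. The result would be used as the multiplier in the loop above, and it never goes above 1, so this particular sketch only attenuates.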

Using some simple algorithms and that function above, you could easily implement a fade-in/out effect on PCM data by stepping through all 148 possible values over a period of time. And don't worry, we'll get to "time" later in the series.

That's pretty much all there is to know about volume. In the next part of the series, we're going to discuss mixing two streams together to create one stream.


Posted by: eric on 01/11/2010

PCM Audio | Part 2: What does a PCM stream look like?

In Part 1, we looked at how a PCM stream is described. Once you know all of the parameters for your PCM stream, we can examine the data and put it in memory as useful data.

So, let’s assume we have a file that contains signed 16-bit little endian mono PCM. That means that data in the file is just a collection of 16 bit integers. Each integer represents one sample. So the first 9 samples in the file could be:

+------+------+------+------+------+------+------+------+------+
|  500 |  300 | -100 | -20  | -300 |  900 | -200 |  -50 |  250 |
+------+------+------+------+------+------+------+------+------+

Each of those integers is stored in the file as 2 bytes (16-bit), so the 9 samples above take up 18 bytes of space. The value of each sample, obviously, can range from -32768 to 32767. If you take those samples and plot them on a graph, you’ll end up with a visualization of the waveform for the audio that you see in your music player.
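To make the storage concrete, here is a small sketch of how two bytes from the file become one sample. The helper name decode_s16le is made up for this example. It reassembles the bytes by hand, so it should behave the same regardless of the byte order of the machine it runs on.

```c
#include <stdint.h>

/* Decode one signed 16-bit little-endian sample from two raw bytes.
 * The low byte comes first in the file, the high byte second. */
int16_t decode_s16le(const uint8_t *bytes) {
    uint16_t raw = (uint16_t)(bytes[0] | (bytes[1] << 8));

    /* Values with the top bit set are negative: subtract 65536 to get
     * the signed value without relying on an out-of-range cast. */
    if (raw >= 0x8000) {
        return (int16_t)((int32_t)raw - 65536);
    }
    return (int16_t)raw;
}
```

For example, the sample 500 from the table above is stored as the bytes 0xF4 0x01, and -100 is stored as 0x9C 0xFF.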

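If we wanted to read that into an array in C, we would do something like this (error checking is left out to keep it short, the file name is just a placeholder, and you'll need stdio.h, stdlib.h, and stdint.h):

FILE *pcmfile;
int16_t *pcmdata;
long filesize;
pcmfile = fopen("audio.pcm", "rb");    /* placeholder name for your PCM data file */
fseek(pcmfile, 0, SEEK_END);           /* seek to the end to find the size of the file */
filesize = ftell(pcmfile);
rewind(pcmfile);
pcmdata = malloc(filesize);
fread(pcmdata, sizeof(int16_t), filesize / sizeof(int16_t), pcmfile);
fclose(pcmfile);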

Of course, if you’re dealing with large files, you probably shouldn’t read the whole thing into memory. You should buffer the data and read it in chunks at a time.
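As a rough sketch of that chunked approach, the function below reads an already opened file 1024 samples at a time and just counts them. The name count_pcm_samples and the chunk size are arbitrary choices for this example, and it assumes both the file and the machine are little-endian.

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

#define CHUNK_SAMPLES 1024   /* arbitrary buffer size for this sketch */

/* Hypothetical helper: read raw 16-bit PCM from an open file in
 * fixed-size chunks and return the number of whole samples seen. */
long count_pcm_samples(FILE *fp) {
    int16_t buf[CHUNK_SAMPLES];
    size_t n;
    long total = 0;

    /* fread returns how many complete samples it managed to read;
     * 0 means end of file (or an error), so the loop stops. */
    while ((n = fread(buf, sizeof(int16_t), CHUNK_SAMPLES, fp)) > 0) {
        /* process buf[0] through buf[n-1] here (amplify, mix, play...) */
        total += (long)n;
    }
    return total;
}
```

Memory use stays at one 2048-byte buffer no matter how large the file is.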

If you take that data and send it to your sound card, you’ll hear the sample being played. However, the sound card will require you to know the sample rate. If you have an 8kHz stream and tell the sound card to play it at 16kHz, it’s like playing a 33.3 RPM record at 45 RPM. For the younger crowd out there, that means it will be too fast and it’ll be high pitched… think Alvin and the Chipmunks here.
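The arithmetic behind that is simple enough to sketch. The helper name playback_ms is made up for this example, and it only covers mono streams.

```c
#include <stdint.h>

/* Hypothetical helper: how long a mono stream lasts, in milliseconds,
 * when played back at the given sample rate. */
int64_t playback_ms(int64_t num_samples, int64_t sample_rate_hz) {
    return num_samples * 1000 / sample_rate_hz;
}
```

So 80000 samples recorded at 8kHz take 10 seconds to play at 8kHz, but only 5 seconds if the sound card is told they are 16kHz.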

Since this is a description of the waveform, a stream of all zeros would be silence (a flat line if you graphed it).

I haven’t really explained what those samples actually MEAN though… just what they are. It will be incredibly obvious what those samples mean starting in the next post, when we get to the fun stuff: basic audio effects processing (don’t get scared… it’s actually really easy).


Posted by: eric on 01/09/2010

PCM Audio | Part 1: What is PCM?

It’s been a long time since I posted anything. Most of my free time has been spent working on my ventrilo client for linux project. Of course, that project adds tons of things to discuss, such as how PCM audio works. I’m going to make this a multi-part series, because there is so much information to discuss.

When I first started working on that project, I knew nothing about how audio worked. I knew a little bit about encoders and decoders, but not really the inner workings. What are they encoding/decoding? It turns out that the answer is PCM (pulse-code modulation) audio. After messing with PCM for a few months, there are a lot of things that are painfully obvious now that were confusing. This guide is meant to be an introduction to at least give you the working knowledge you’ll need to ask proper questions and perform simple tasks. So let’s get started…

If you’ve ever used a computer MP3 player, you’ve probably seen those options to display the waveform of the audio or the little bars that pop up and down showing you treble and bass levels. What those are measuring is the PCM audio as it plays it. So what does all that crap mean?

Let’s start with the basics. There are five terms that are important to know for PCM:

Sample Rate

Real actual audio (like someone talking to you in person) is transmitted as a wave. PCM is a digital representation of that audio wave at a specified sample rate. The sample rate is measured in Hz (samples per second) and more often in kilohertz. So when you hear someone talk about 44.1kHz vs. 48kHz audio, what they’re talking about is the sample rate. If you’ve ever done integrals in calculus, it’s a lot like that. The higher the sample rate, the better your quality (at the cost of size). There is no guessing here. You need to know what the sample rate is.

Sign

Whether the data is signed or unsigned. It is almost always signed. Treating a signed PCM stream as unsigned will hurt your ears… painfully… (I speak from experience here).

Sample Size

This determines how many bits make up one sample. 16-bit seems to be the most common.

Byte Ordering

Byte ordering refers to little-endian vs. big-endian data. If you don’t know what endian-ness means, you can probably assume little endian. If you have the option to choose endian for your data, you should always choose little-endian.

Number of channels

I’m mostly going to cover mono (1 channel), but multichannel PCM is usually handled by interleaving the PCM samples. Don’t worry about this for now. Once you understand mono, stereo is easy.

Add those five things together and you’ll come up with a description of a PCM stream. For example: signed 16-bit little-endian mono @ 44.1kHz. In order to actually play audio, you’ll need to know those 5 things.

Various sound devices support various types of streams, but there’s usually a set list of sign, sample size, and endian-ness options. Different APIs use different constants to specify, but usually you’ll see them as something like S16LE (signed 16-bit little-endian) or S32BE (signed 32-bit big-endian) and so on.
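If it helps to see those five parameters as data, here is a minimal sketch. The struct and function names are made up for this example and are not taken from any particular audio API.

```c
#include <stdint.h>

/* Hypothetical description of a PCM stream: the five parameters above. */
struct pcm_format {
    uint32_t sample_rate;    /* samples per second, e.g. 44100 */
    int      is_signed;      /* nonzero for signed samples */
    uint8_t  bits;           /* sample size in bits, e.g. 16 */
    int      little_endian;  /* nonzero for little-endian byte order */
    uint8_t  channels;       /* 1 = mono, 2 = stereo */
};

/* Bytes of raw audio needed for one second of playback. */
uint32_t pcm_bytes_per_second(const struct pcm_format *fmt) {
    return fmt->sample_rate * (fmt->bits / 8u) * fmt->channels;
}
```

For signed 16-bit little-endian mono @ 44.1kHz, that works out to 88200 bytes per second. Stereo doubles it to 176400.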

In my next post, I’ll go over how those are represented in a PCM stream.


Posted by: eric on 01/08/2010

Netflix on the PS3

I would be remiss to not have a blog post about Netflix on the PS3. As much as I post about streaming video to the PS3 and as much as I love Netflix, I can’t resist chiming in on this one.

First of all, I don’t care what the CEO of Netflix says, but having to put in a disc to stream movies sucks. The streaming app should be an installable application that sits on the XMB.

It’s not a matter of being lazy, it is a matter of convenience. Back on the PS2, when I started working on streaming video to Sony devices using BroadQ (oh yeah… btw, I’ve been working on this for about 6 or 7 years now), it was annoying to have to load in the BroadQ disc to stream the movies. I can’t imagine it will be any less annoying 7 years later when the system has a hard drive that is perfectly capable of storing the application.

That said, I’m excited this is finally happening. There’s little doubt that Microsoft opened up the checkbook to prevent interoperability. Netflix will be available for the PS3 almost exactly one year after XBox. For games, I can understand these exclusive agreements. For third party services such as Netflix, I think it’s a dick move on Microsoft’s part. I view it as yet another good reason not to support their console.

As a Netflix subscriber, I think it’s a bad move by both Netflix and Microsoft. This should have happened long ago.


Posted by: eric on 10/29/2009

Safari User Agent Strings Are Dumb

Sometimes it’s hard to decide which company is worse: Apple or Microsoft. Apple’s web browser, Safari, has the dumbest User-Agent of any browser out there. Opera, Firefox, and even MSIE figured out how to do it correctly. You’d think Apple would be able to figure it out, too.

Generally, people want to know the major revision of a browser. The minor revision is usually not important since most of the time these are just used for reporting statistics. Well, as I’m here setting up a piece of software that does simple string searches in the user-agent to determine browser version, I got hung up on Safari. Here’s a basic mapping of User-Agent search terms to corresponding browser versions:

opera6 opera/6
opera6 opera 6
opera7 opera/7
opera7 opera 7
galeon1 galeon/1
explorer3 msie 3.
explorer4 msie 4.
explorer5 msie 5.
explorer6 msie 6.
explorer7 msie 7.
explorer8 msie 8.
konqueror2 konqueror/2
konqueror3 konqueror/3
netscape6 netscape6
netscape7 netscape/7
netscape4 mozilla/4
netscape3 mozilla/3
firefox1 firefox/1
firefox2 firefox/2
safari1 safari/1
safari1 safari/85
safari1 safari/3
safari2 safari/4
safari4 safari/5
chrome chrome/0

So check out those Safari searches. The reason they look that way is because Safari doesn’t report its version number as Safari/X. Where there should be a major version number is a WebKit build number or some other useless value. Safari does put the actual version into the User-Agent string, but does it in such a fashion that you either need multiple string searches or regex to figure it out. Check out some Safari User-Agent strings:

Safari 3.1.2:
Mozilla/5.0 (...) AppleWebKit/525.18 (KHTML, like Gecko) Version/3.1.2 Safari/525.20.1

Safari 4:
Mozilla/5.0 (...) AppleWebKit/531.9 (KHTML, like Gecko) Version/4.0.3 Safari/531.9

So by matching Safari/5, you’re actually going to end up with either Safari 3.x or Safari 4.x. Or you could search for “Safari” and then search for “Version/X.” Or you could regex search for “/Version\/4.*Safari/”. Of course, this software I’m configuring doesn’t support either of those methods because only a mind-bogglingly stupid browser maker would use this format in their User-Agent string.

Oh, and if you’re wondering why Safari/85 matches Safari 1.x, here’s some 1.x User-Agents:

Mozilla/5.0 (...) AppleWebKit/85.8.5 (KHTML, like Gecko) Safari/85.8.1
Mozilla/5.0 (...) AppleWebKit/125.5.6 (KHTML, like Gecko) Safari/125.12
Mozilla/5.0 (...) AppleWebKit/312.1 (KHTML, like Gecko) Safari/312
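
If you do control the code, the two-step search is only a few lines of C. This is a sketch with a made-up function name. It treats any string containing "Chrome/" as not Safari, since Chrome's User-Agent also includes a Safari/ token.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return Safari's major version from a User-Agent
 * string, 0 if it is Safari without a "Version/" token (like the 1.x
 * strings above), or -1 if it does not look like Safari at all. */
int safari_major_version(const char *ua) {
    const char *ver;

    /* Chrome also advertises "Safari/", so rule it out first */
    if (strstr(ua, "Chrome/") != NULL || strstr(ua, "Safari/") == NULL) {
        return -1;
    }
    ver = strstr(ua, "Version/");
    if (ver == NULL) {
        return 0;
    }
    /* atoi stops at the first non-digit, so "3.1.2" yields 3 */
    return atoi(ver + strlen("Version/"));
}
```

Run against the strings above, the Safari 3.1.2 string gives 3, the Safari 4 string gives 4, and the 1.x strings give 0.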

Posted by: eric on 10/21/2009

PulseAudio: An Async Example To Get Device Lists

I have a love/hate relationship with PulseAudio. The PulseAudio simple API is… well…. simple. For 99% of the applications out there, you’ll rarely need anything more than the simple API. The documentation leaves a little to be desired, but it’s not too hard to figure out since you have the sample source code for pacat and parec.

The asynchronous API, on the other hand, is really complex. The learning curve isn’t really a curve. It’s more like a brick wall. Compounding the issue is that the documentation is atrocious. If you know exactly what you’re looking for and if you already know how it works, the documentation can be helpful.

More importantly, simple example code is nearly impossible to come by. So, since I took the time to figure it out, I figured I would document this here in the hopes that this little example will help someone else. This is not production ready code. There’s a lot of error checking that’s not being done. But this should at least give you an idea of how to use the PulseAudio asynchronous API.

Update: I spoke with the PulseAudio team and they encouraged me to put this source code on their wiki. So now you can find it at the main PulseAudio wiki: http://pulseaudio.org/wiki/SampleAsyncDeviceList


Save the code below as pulsedevlist.c and compile this program with: gcc -Wall -o pulsedevlist pulsedevlist.c -lpulse

#include <stdio.h>
#include <string.h>
#include <pulse/pulseaudio.h>

// Field list is here: http://0pointer.de/lennart/projects/pulseaudio/doxygen/structpa__sink__info.html
typedef struct pa_devicelist {
	uint8_t initialized;
	char name[512];
	uint32_t index;
	char description[256];
} pa_devicelist_t;

void pa_state_cb(pa_context *c, void *userdata);
void pa_sinklist_cb(pa_context *c, const pa_sink_info *l, int eol, void *userdata);
void pa_sourcelist_cb(pa_context *c, const pa_source_info *l, int eol, void *userdata);
int pa_get_devicelist(pa_devicelist_t *input, pa_devicelist_t *output);

// This callback gets called when our context changes state.  We really only
// care about when it's ready or if it has failed
void pa_state_cb(pa_context *c, void *userdata) {
	pa_context_state_t state;
	int *pa_ready = userdata;

	state = pa_context_get_state(c);
	switch  (state) {
		// These are just here for reference
		case PA_CONTEXT_UNCONNECTED:
		case PA_CONTEXT_CONNECTING:
		case PA_CONTEXT_AUTHORIZING:
		case PA_CONTEXT_SETTING_NAME:
		default:
			break;
		case PA_CONTEXT_FAILED:
		case PA_CONTEXT_TERMINATED:
			*pa_ready = 2;
			break;
		case PA_CONTEXT_READY:
			*pa_ready = 1;
			break;
	}
}

// pa_mainloop will call this function when it's ready to tell us about a sink.
// Since we're not threading, there's no need for mutexes on the devicelist
// structure
void pa_sinklist_cb(pa_context *c, const pa_sink_info *l, int eol, void *userdata) {
    pa_devicelist_t *pa_devicelist = userdata;
    int ctr = 0;

    // If eol is set to a positive number, you're at the end of the list
    if (eol > 0) {
	return;
    }

    // We know we've allocated 16 slots to hold devices.  Loop through our
    // structure and find the first one that's "uninitialized."  Copy the
    // contents into it and we're done.  If we receive more than 16 devices,
    // they're going to get dropped.  You could make this dynamically allocate
    // space for the device list, but this is a simple example.
    for (ctr = 0; ctr < 16; ctr++) {
	if (! pa_devicelist[ctr].initialized) {
	    strncpy(pa_devicelist[ctr].name, l->name, 511);
	    strncpy(pa_devicelist[ctr].description, l->description, 255);
	    pa_devicelist[ctr].index = l->index;
	    pa_devicelist[ctr].initialized = 1;
	    break;
	}
    }
}

// See above.  This callback is pretty much identical to the previous
void pa_sourcelist_cb(pa_context *c, const pa_source_info *l, int eol, void *userdata) {
    pa_devicelist_t *pa_devicelist = userdata;
    int ctr = 0;

    if (eol > 0) {
	return;
    }

    for (ctr = 0; ctr < 16; ctr++) {
	if (! pa_devicelist[ctr].initialized) {
	    strncpy(pa_devicelist[ctr].name, l->name, 511);
	    strncpy(pa_devicelist[ctr].description, l->description, 255);
	    pa_devicelist[ctr].index = l->index;
	    pa_devicelist[ctr].initialized = 1;
	    break;
	}
    }
}

int pa_get_devicelist(pa_devicelist_t *input, pa_devicelist_t *output) {
    // Define our pulse audio loop and connection variables
    pa_mainloop *pa_ml;
    pa_mainloop_api *pa_mlapi;
    pa_operation *pa_op;
    pa_context *pa_ctx;

    // We'll need these state variables to keep track of our requests
    int state = 0;
    int pa_ready = 0;

    // Initialize our device lists
    memset(input, 0, sizeof(pa_devicelist_t) * 16);
    memset(output, 0, sizeof(pa_devicelist_t) * 16);

    // Create a mainloop API and connection to the default server
    pa_ml = pa_mainloop_new();
    pa_mlapi = pa_mainloop_get_api(pa_ml);
    pa_ctx = pa_context_new(pa_mlapi, "test");

    // This function connects to the pulse server
    pa_context_connect(pa_ctx, NULL, 0, NULL);

    // This function defines a callback so the server will tell us its state.
    // Our callback will wait for the state to be ready.  The callback will
    // modify the variable to 1 so we know when we have a connection and it's
    // ready.
    // If there's an error, the callback will set pa_ready to 2
    pa_context_set_state_callback(pa_ctx, pa_state_cb, &pa_ready);

    // Now we'll enter into an infinite loop until we get the data we need
    // or if there's an error
    for (;;) {
	// We can't do anything until PA is ready, so just iterate the mainloop
	// and continue
	if (pa_ready == 0) {
	    pa_mainloop_iterate(pa_ml, 1, NULL);
	    continue;
	}
	// We couldn't get a connection to the server, so exit out
	if (pa_ready == 2) {
	    pa_context_disconnect(pa_ctx);
	    pa_context_unref(pa_ctx);
	    pa_mainloop_free(pa_ml);
	    return -1;
	}
	// At this point, we're connected to the server and ready to make
	// requests
	switch (state) {
	    // State 0: we haven't done anything yet
	    case 0:
		// This sends an operation to the server.  pa_sinklist_cb is
		// our callback function and a pointer to our devicelist will
		// be passed to the callback.  The operation ID is stored in the
		// pa_op variable
		pa_op = pa_context_get_sink_info_list(pa_ctx,
			pa_sinklist_cb,
			output
			);

		// Update state for next iteration through the loop
		state++;
		break;
	    case 1:
		// Now we wait for our operation to complete.  When it's
		// complete our pa_output_devicelist is filled out, and we move
		// along to the next state
		if (pa_operation_get_state(pa_op) == PA_OPERATION_DONE) {
		    pa_operation_unref(pa_op);

		    // Now we perform another operation to get the source
		    // (input device) list just like before.  This time we pass
		    // a pointer to our input structure
		    pa_op = pa_context_get_source_info_list(pa_ctx,
			    pa_sourcelist_cb,
			    input
			    );
		    // Update the state so we know what to do next
		    state++;
		}
		break;
	    case 2:
		if (pa_operation_get_state(pa_op) == PA_OPERATION_DONE) {
		    // Now we're done, clean up and disconnect and return
		    pa_operation_unref(pa_op);
		    pa_context_disconnect(pa_ctx);
		    pa_context_unref(pa_ctx);
		    pa_mainloop_free(pa_ml);
		    return 0;
		}
		break;
	    default:
		// We should never see this state
		fprintf(stderr, "in state %d\n", state);
		return -1;
	}
	// Iterate the main loop and go again.  The second argument is whether
	// or not the iteration should block until something is ready to be
	// done.  Set it to zero for non-blocking.
	pa_mainloop_iterate(pa_ml, 1, NULL);
    }
}

int main(int argc, char *argv[]) {
    int ctr;

    // This is where we'll store the input device list
    pa_devicelist_t pa_input_devicelist[16];

    // This is where we'll store the output device list
    pa_devicelist_t pa_output_devicelist[16];

    if (pa_get_devicelist(pa_input_devicelist, pa_output_devicelist) < 0) {
	fprintf(stderr, "failed to get device list\n");
	return 1;
    }

    for (ctr = 0; ctr < 16; ctr++) {
	if (! pa_output_devicelist[ctr].initialized) {
	    break;
	}
	printf("=======[ Output Device #%d ]=======\n", ctr+1);
	printf("Description: %s\n", pa_output_devicelist[ctr].description);
	printf("Name: %s\n", pa_output_devicelist[ctr].name);
	printf("Index: %u\n", pa_output_devicelist[ctr].index);
	printf("\n");
    }

    for (ctr = 0; ctr < 16; ctr++) {
	if (! pa_input_devicelist[ctr].initialized) {
	    break;
	}
	printf("=======[ Input Device #%d ]=======\n", ctr+1);
	printf("Description: %s\n", pa_input_devicelist[ctr].description);
	printf("Name: %s\n", pa_input_devicelist[ctr].name);
	printf("Index: %u\n", pa_input_devicelist[ctr].index);
	printf("\n");
    }
    return 0;
}


Posted by: eric on 10/13/2009

Mangler: Ventrilo for Linux

Many a gamer has fought with the Linux migration because of Ventrilo. Most games run just fine under wine, but when it comes to VOIP apps for gamers, Ventrilo is by far the leader. The official Linux Ventrilo client has been “in development” for about 3 or 4 years now.

Over the past couple of years, a group of open source developers has spent a lot of time reverse engineering the Ventrilo protocol. Based on their work, we have created an almost functional client. It is stable (in the sense that it doesn’t crash) and you can receive audio if you use the Pulse Audio daemon.

The website for Mangler is http://www.mangler.org/.

For the latest developer info, we’ve set up a trac page at http://www.mangler.org/trac for bug reports and/or following our progress.

If you want to help and you have experience with C, C++, Linux audio systems (pulse, alsa, oss, etc), and/or GTK+, join us on irc.freenode.net in #mangler and see if you can help us. In the immediate short term, we need graphic designers (bring us a logo!). Interface designers are welcome as well. If you want to help make this happen, post a comment and/or join us on IRC.


Posted by: eric on 10/07/2009

Got a PS3? Want Hulu Back? And you’re a Windows user?

As a follow up to unblocking Hulu on the PS3, I’ve gotten tons of responses. While that solution will successfully work in Windows, setting up squid on a Microsoft OS can be a painful task. Windows makes the simplest things incredibly difficult.

But squid was definitely designed for Linux, so that’s not incredibly surprising.

Thankfully, for the windows users out there, Jonathan Morales has this to contribute:

You can set this up in half the time using windows and an old program called proxomitron. You might want to post this for those who just want it to work easily:

step 1) Download proxomitron from http://www.proxomitron.info
step 2) open proxomitron and uncheck everything except the outgoing header filters
step 3) open outgoing header filters and uncheck them all, but find one of the user_agent ones and modify it as such:

  1. change the header name to user-agent:firefox win32
  2. (optional) in the url match, put *.hulu.com – this will only activate the header if you go to that url
  3. header value match should be *
  4. change the value to Mozilla/5.0 (Windows NT 5.1; en-US) Firefox/3.0.11
  5. make sure that the outgoing checkbox is checked but not the in

step 4) on the proxomitron main window go to config, click the access tab and set to allow connections from your local network e.g. 192.168.0.1 to 192.168.0.255

step 5) go to file>save default settings, and then close and reopen proxomitron

step 6) On the ps3 network settings, set it to use a proxy server and point it to the ip address of your windows machine running proxomitron, port 8080.

Other comments:

it’s a good practice to set your ip address on the windows machine that runs proxomitron to be static, so that there’s no chance it’ll change and your ps3 starts pointing to an ip address that doesn’t exist or is assigned to another computer.


Posted by: eric on 09/22/2009

The good old English to English translator

Some years ago, for a reason that I don’t remember, I wrote an English to English translator. Basically it translates from English through a whole host of other languages and then back to English.

Anyhow, someone out there on the Internet found it amusing and posted this comment on an unrelated entry:

dude your english to english converter rocks, where can i find something like this for microsoft word or a web version that i can use on large documents

So I responded in email, which spawned this conversation:

random internet guy to me:
Thank you for returning to me, I am very happy with this tool is useful for something in the translation from English into another language besides English, they speak English well, or to say that, like language, queen of Great Britain is cold, this tool is that the majority of Britons seem silly, but it is a kind of spiritual support, he plays all the tribal mentality of Le Mans. Whether or not interest me, but it seems very impressive and culturally cool. But Google was that of syntax or algorithm that is used, because we noticed that in motion, or at least five different languages.

me to random internet guy:
I am glad that he was a useful tool. I do not know why and I said, a translator, but it must be useful, since it is often in response to people who actually come to conflicting messages. Sometimes I wish I had heard that people are sent e-mail, yes, but again I sometimes think that life is absurd afraid of me. I do not know better, what the hell is going on in the head. In any case, I am pleased that, as a useful and entertaining.

I love the Internet.


Posted by: eric on 08/31/2009