kaektech

2007.02.05: The way ahead

I am calling this piece of software 'The Hedge' for now. I have reconfigured the CNC site to be the new home for this project. I have decided that in order for this project to achieve the fit and finish it will need to work reliably, I will need outside assistance, so I am going to make all of the code open source under the GPL license and begin actively seeking development assistance from sources outside my head.

That said, I have surprised myself with how swiftly I am able to marshal information thanks to the technology available to me. I have found all sorts of diverse snippets of code that each fulfill a niche in the overall design of this project. My task now is to integrate all of these snippets into a cohesive, functional product. I am fully certain that if I attempted a project of this scale in any language besides Python, I would be lost beyond the pale, and likely unable to execute a working design in any sort of reasonable time horizon.

I probably won't keep this page very up to date except for gee-whizz hardware stuff (soon to come... audio and video and distributed computing experiments, oh my!) and major announcements regarding the Hedge software. For regular updates on the software I recommend that you visit the new and improved project page linked above.

Significant progress

I have finally made significant headway into how all of this fancy computer stuff is gonna work ('bout time, eh?).

I have begun, as of today, 2007.01.17, creating a program that will grab market data off the internet, parse it, calculate based off of it, and deliver results.

At first glance, this is not terribly revolutionary. Nothing in the previous sentence could not be done by, say, http://finance.yahoo.com or some other such entity. The magic lies in the "calculate based off of it" phrase, and my fancy maths. I am going to deliver a measure of volatility, applicable to any asset, and I'll create a presentation format that allows the comparison of volatility across a maximally diverse set of asset classes. To the well-trained eye, many refinements of hedging strategies abound here...
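To make the shape of the program a little more concrete, here is a minimal Python sketch of the parse-calculate-deliver plumbing. The file name and column layout are hypothetical (assume the market data has already been downloaded somewhere as a CSV), and the actual measure I have in mind involves fancier maths than a plain annualized standard deviation of log returns; this is just an illustration of the pipeline.

    import csv
    import math

    def load_closes(path):
        """Read closing prices from a CSV that has a 'Close' column (layout assumed)."""
        closes = []
        with open(path) as f:
            for row in csv.DictReader(f):
                closes.append(float(row["Close"]))
        return closes

    def annualized_volatility(closes, periods_per_year=252):
        """Sample standard deviation of log returns, annualized -- one simple volatility measure."""
        returns = [math.log(later / earlier) for earlier, later in zip(closes, closes[1:])]
        mean = sum(returns) / len(returns)
        variance = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
        return math.sqrt(variance * periods_per_year)

    if __name__ == "__main__":
        closes = load_closes("prices.csv")   # hypothetical, pre-downloaded data file
        print("annualized volatility: %.4f" % annualized_volatility(closes))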

Network Topology

If you are curious as to how this cluster is laid out, I present the following ChartPorn courtesy of MS Visio, 'cause it was here, & why not.

Progress

I have read fairly far into the maths of fractal geometry and downloaded some software to try out the fancy maths. I am currently running SciLab with a partially operational FracLab module. I don't yet know if the fancy math will scale over my cluster, but I'm optimistic. SciLab appears at first glance to be fully compatible with MatLab, and MatLab has been reported to distribute processes well under openMosix. It remains to be determined if SciLab plays well also. Test.Report.Repeat.
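For the curious about what "fractal maths" looks like in practice, one common fractal statistic for a price series is the Hurst exponent, which can be estimated by rescaled-range (R/S) analysis. The Python sketch below (NumPy assumed) is my own illustration of the idea, not necessarily what FracLab computes under the hood.

    import numpy as np

    def hurst_rs(series, min_window=8):
        """Estimate the Hurst exponent of a 1-D series via rescaled-range (R/S) analysis."""
        series = np.asarray(series, dtype=float)
        n = len(series)
        sizes, rs_means = [], []
        size = min_window
        while size <= n // 2:
            rs = []
            for start in range(0, n - size + 1, size):
                chunk = series[start:start + size]
                deviations = np.cumsum(chunk - chunk.mean())
                r = deviations.max() - deviations.min()   # range of cumulative deviations
                s = chunk.std()                           # standard deviation of the chunk
                if s > 0:
                    rs.append(r / s)
            if rs:
                sizes.append(size)
                rs_means.append(np.mean(rs))
            size *= 2
        # log(R/S) grows roughly as H * log(window size); the slope of the fit is H
        slope, _ = np.polyfit(np.log(sizes), np.log(rs_means), 1)
        return slope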

For the curious

The SupaComputa broadcasts itself at a certain IP address, which I will gladly provide, provided I can determine who you are. Upon determination of your identity I will email you a login and password to enter the SupaComputa over the internet. SSL (preferred, but not all services available), Gopher (curiosity), FTP (up/down transfers), BitTorrent & Gnutella (P2P network transfers), and other protocols are available upon request.

Why

Why not? Actually, my original idea was to be able to stream video and audio to anywhere. Now, this application does not require a computer cluster per se, but ancillary processes associated with this application could gain some benefit, such as file format conversion, video rendering, etc. Then I came upon a fascinating line of economic research which is poorly labeled "econophysics." In order to fully explore this field, I needed the ability to run complex mathematical simulations at my house (since I am no longer a part of the "academy"). And those simulations can be optimized to run on a computing cluster so that the results of a particular calculation are rendered more efficiently, and I like efficiency.

Hardware

Oh, any 'ole PC will suffice. Seriously. My current setup is a bunch of junk computers that I resurrected from the dead by replacing bad RAM modules, bad hard drives, or bad ethernet cards. They, for the time being, are all Pentium III machines with between 256MB and 1GB of RAM, and they all can talk over a network. Theoretically I can make just about any old PC work in this environment; however, faster processors and more RAM are, obviously, preferred. The magic is in the software, not the hardware.

Operating system

Since setting up the system I have come across a number of sources that document setting up a computing cluster using Windows XP, and even more support for the idea is delivered by Apple OSX, but nothing beats the variety of Linux variants for this type of project. Such is the avenue I took.

Utilizing Linux as the OS and wanting to create a cluster, one has multiple options. You can create the "OG" of cluster types, known as a Beowulf cluster, but I found its documentation to be poorer than that of the alternative. The route I took was to create an openMosix cluster. I dig the fact that it is well documented (cluster computing can be tricky), and it appears to have a solid developer network.

OpenMosix is, fundamentally, applied at the kernel level by taking a plain Linux kernel, applying a patch, and installing supporting application software. The openMosix project has a kernel patch in development for the newest 2.6.* kernel series, but the most stable patch is for kernel 2.4.26, which is what my machines run.

OK, in Engrish

A computer cluster is a bunch of boxes that are configured to run in parallel. They normally share a common operating system and _basic_ hardware makeup that allows them to "divide-and-conquer." Say you have a PC and it is running a program. That program can be one monolithic process, or it can be many smaller processes spawned by a common user interface. If your program is of the latter type, it can be distributed, meaning each "daughter" process can be launched, sent to a processor with available resources, and return its results back to the main interface. Running, say, the Apache web server is a monolithic process, but encoding a dozen (or a thousand) raw music files to .mp3 is not, and can be distributed. Certain mathematical applications are also good candidates for distributed, or cluster, computing.
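As a toy illustration of the "many daughter processes" case, here is a small Python sketch using the multiprocessing module as a stand-in for a real workload (openMosix itself needs no special API, since it migrates ordinary forked processes at the kernel level). Each worker does some CPU-bound crunching that the cluster could farm out to whichever node has spare cycles.

    import multiprocessing

    def crunch(task_id):
        """Stand-in for one 'daughter' process, e.g. encoding a single raw audio file."""
        total = 0
        for i in range(1, 200000):
            total += i * i          # any CPU-bound busywork will do for the illustration
        return task_id

    if __name__ == "__main__":
        # Spawn one process per task; under openMosix the kernel is free to migrate
        # these workers to other nodes in the cluster that have idle CPU.
        pool = multiprocessing.Pool(processes=8)
        for task_id in pool.imap_unordered(crunch, range(100)):
            print("task %d finished" % task_id)
        pool.close()
        pool.join()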

Applications

For absolute basic cluster functionality I needed to install the openMosix user tools. Next, I installed a nice interface for these tools called openMosixView. Once all of these are running (no small task), all sorts of goodies become available, such as a web-based GUI, other administration toys, and of course, real stuff to do. I am running a distributed video rendering program called Dr. Queue, and also exploring the world of QuantLib and CHPOX.
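For a taste of what QuantLib offers, here is a minimal sketch using its Python bindings (assuming they are installed; the exact API can vary between versions) that prices a plain European call under Black-Scholes with made-up market numbers.

    import QuantLib as ql

    today = ql.Date(5, ql.February, 2007)
    ql.Settings.instance().evaluationDate = today

    # Flat, made-up market data purely for illustration
    spot  = ql.QuoteHandle(ql.SimpleQuote(100.0))
    rates = ql.YieldTermStructureHandle(ql.FlatForward(today, 0.05, ql.Actual365Fixed()))
    divs  = ql.YieldTermStructureHandle(ql.FlatForward(today, 0.0, ql.Actual365Fixed()))
    vols  = ql.BlackVolTermStructureHandle(
        ql.BlackConstantVol(today, ql.TARGET(), 0.20, ql.Actual365Fixed()))

    process = ql.BlackScholesMertonProcess(spot, divs, rates, vols)

    # A plain European call struck at 100, expiring in six months
    payoff   = ql.PlainVanillaPayoff(ql.Option.Call, 100.0)
    exercise = ql.EuropeanExercise(ql.Date(5, ql.August, 2007))
    option   = ql.VanillaOption(payoff, exercise)
    option.setPricingEngine(ql.AnalyticEuropeanEngine(process))

    print("option value: %.4f" % option.NPV())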

Some Links

I need to fill this empty space. Email me a suggestion.
