Saturday, February 22, 2014

RTL_POWER data in Hadoop/HDFS using Impala and GNUPLOT

I am now running rtl_power on 12 RTL dongles, covering 100 MHz per device at a resolution of 1.3 kHz over 1 minute.

Today I found something interesting (finally, not just the boring stuff).  See the post here.

Anyway, after finding that, I decided to put "Impala" to real use for the first time since I set up the Hadoop/HDFS/Hive/Impala cluster.

Every hour I import the rtl_power data, which has been post-processed and placed in LZO-compressed files, into Hadoop's HDFS using a Hive loader (more on that some other time).  That puts the data into a table within Hive.  Impala can search these LZO-compressed logs VERY quickly!  So I decided to put this process to the test today (finally).
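
The post-processing step isn't shown here, so the following is only a rough sketch of what flattening rtl_power output into one row per frequency bin could look like.  It relies on rtl_power's CSV layout (date, time, Hz low, Hz high, Hz step, samples, then one dB value per bin).  The function name, the sample line, and the assumption that bin i sits at Hz low + i * Hz step are all illustrative, not the actual loader.

```python
from typing import Iterator, Tuple


def flatten_rtl_power_line(line: str) -> Iterator[Tuple[str, str, int, float]]:
    """Yield (date, time, freq_hz, db) for each bin in one rtl_power CSV line."""
    fields = [f.strip() for f in line.strip().split(",")]
    date, time = fields[0], fields[1]
    hz_low = float(fields[2])
    hz_step = float(fields[4])
    # fields[3] is Hz high and fields[5] is the sample count; unused here.
    for i, value in enumerate(fields[6:]):
        # Assumption: bin i is located at hz_low + i * hz_step.
        yield date, time, int(round(hz_low + i * hz_step)), float(value)


# Made-up sample line, for illustration only.
sample = "2014-02-22, 17:00:01, 137100000, 137105200, 1300.00, 10, -20.5, -21.0, -19.8, -22.1"
rows = list(flatten_rtl_power_line(sample))
for row in rows:
    print(row)
```

Rows in this shape (one timestamp, one frequency, one power value) are what a query on freq and dbm would need.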

Using the following commands, within about 6 seconds I created the graph below with GNUPLOT.  It shows what I think are NOAA weather satellite transmissions.

  • impala-shell -d prod -r -i node2 -f wx_sat.sql -o /tmp/wx_sat.dat --output_delimiter=, -B
  • gnuplot wx_sat
  • gimp /var/www/mufmon/wx_sat.2014-02-22.20.png

wx_sat.sql looks like this:

SELECT ldate, freq, dbm
FROM powerlog
WHERE logdate = '2014-02-22'
  AND loghour >= '17'
  AND hz_start = 100000000
  AND freq BETWEEN 137100000 AND 137800000;
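
For anyone who wants to look at the query output without gnuplot, here is a small sketch that groups the comma-delimited rows written by impala-shell into one row of power values per timestamp (a waterfall grid).  The function name and sample rows are made up, and it assumes that the ldate strings sort chronologically, since the actual format of that column isn't shown here.

```python
import csv
import io
from collections import defaultdict


def pivot_waterfall(lines):
    """Group (ldate, freq, dbm) rows into one {freq: dbm} dict per timestamp."""
    grid = defaultdict(dict)
    for ldate, freq, dbm in csv.reader(lines):
        grid[ldate][int(float(freq))] = float(dbm)
    return dict(grid)


# Illustrative rows only; real input would be open("/tmp/wx_sat.dat").
sample = io.StringIO(
    "2014-02-22 17:00:00,137100000,-20.5\n"
    "2014-02-22 17:00:00,137101300,-21.0\n"
    "2014-02-22 17:01:00,137100000,-19.8\n"
    "2014-02-22 17:01:00,137101300,-22.1\n"
)
grid = pivot_waterfall(sample)
for ts in sorted(grid):  # assumes timestamps sort correctly as strings
    freqs = sorted(grid[ts])
    print(ts, [grid[ts][f] for f in freqs])
```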

The gnuplot for this looks like this:

The white line within the graph was when I was testing and calibrating the RTL so it was not generating any rtl_power data during that time.

The two dark spaces near the bottom are when the LNA was turned off for maintenance.

The 'Doppler-like' lines, I believe, are NOAA and/or MET weather sats.

137.2 - 137.25Mhz (interesting ???)

Not really sure what this is.  The bottom graph is 18z and the top graph is 19z.  Each covers 1 hour, with the oldest time at the bottom of the respective graph.

What would be at 137.2 MHz that has a slow Doppler shift like this?  I'm thinking weather sats...
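
As a rough sanity check on the satellite idea, the sketch below estimates an upper bound on the Doppler shift at 137.2 MHz using the simple non-relativistic formula (shift = frequency * radial velocity / c).  The 7,400 m/s figure is an assumed, roughly typical orbital speed for a polar weather satellite at around 850 km.  The radial velocity seen from the ground is always less than that, so the real shift would be somewhat smaller.

```python
C = 299_792_458.0  # speed of light in m/s


def doppler_shift_hz(freq_hz: float, radial_velocity_ms: float) -> float:
    """Non-relativistic Doppler shift for a source moving at radial_velocity_ms."""
    return freq_hz * radial_velocity_ms / C


# Assumption: approximate orbital speed of a ~850 km polar orbiter.
ORBITAL_SPEED_MS = 7_400.0

shift = doppler_shift_hz(137.2e6, ORBITAL_SPEED_MS)
print(f"upper bound on shift: +/-{shift / 1e3:.1f} kHz")
print(f"about {shift / 1300:.1f} bins at 1.3 kHz resolution")
```

Under these assumptions the bound works out to roughly 3.4 kHz either side of the carrier, or only a few bins at 1.3 kHz resolution.  A drift much wider than that would be hard to explain by orbital Doppler alone.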

Sunday, February 16, 2014

CDAR/MUFMON Hadoop Cluster Lives

I've got 3 datanodes running Hadoop now in a basic configuration (I need more storage drives, which are ordered, and also two new 1gb hubs, which are also ordered).  Here is a photo of the new cluster running.  It's not much to look at, but it is a basic functional Hadoop cluster now:

RTL SDR RF to Hadoop Front End (0-1.7Ghz @ 1.3Khz resolution)

This is still a work in progress.  However this is what I've done so far.  All of this has been tested outside of this setup previously.  Now I'm just trying to clean things up, and get them organized for the long-haul.

I 'gutted' a 1U rack mount server case from an old Penguin-Computing box I've had in storage for many years, to use as a 'home' for all the RTL dongles in the project.  This will help keep things organized, and keep things from getting damaged, or messed up over time.

So far I have enough for 12 dongles, which gets me from 0-1.2 GHz, with one of the dongles being used with an HF upconverter.  (It's not shown in the photos so far, but it will be the one farthest to the left in the top photo.)

I'm using the mini-dongles from Nooelec, which seem to be quite good in testing so far: as good as, if not better than, the full-sized ones sold by the same company.

Pictured is a 1.5 amp 15v supply, which is way overkill for the two LNAs I'll be using.

I've ordered a +15v Meanwell Switching Power Supply that should be here in a week or so.  That will power all of the "Coolgear" powered-usb-hubs, as well as the 15v LNA's at 10amps.

The nice thing about using this 1U case is that it fits neatly into my rackmount rack, along with 4 Linux servers.  The RF Linux server will be running Ubuntu and provide desktop access.  The 3 other servers will be used in a Hadoop cluster for storing data from the RF server running 'rtl_power'.

More details will be forthcoming.  But for now this is what I can show you.

Thursday, February 13, 2014

New "Cluster" box

I found a place on Ebay that sells used rack mount servers and got one with 4 Xeon CPUs.  Each CPU has 4 cores, so 16 cores total at 2.4 GHz.  The box has 24 GB RAM and can handle up to something like 192 GB, LOL.  It came with 3 ea. 160 GB 7200rpm drives.  Eventually I will probably replace those with 6 ea. 1 TB 7200rpm drives.  I found some of those new on Ebay for a reasonable price as well.

This box is pretty decent!  I spent $320 total on it.  Not bad really, all things considered.  I wish it had 48 GB RAM though.

I will probably get 2 more of these boxes from the same seller.

I was given 2 more servers that have 8 GB RAM and 4 cores each.  They're not fancy, but every little bit helps in a cluster like "Hadoop".  So...I'll take 'em.

I'll snap some pix of this new box soon and post 'em here.

This box is going to go into the RTL SDR 0-1.7 GHz monitoring system I've previously posted about here.

Sunday, February 9, 2014

Example RF to Hadoop System

This image depicts a map of RF to Data storage I've been thinking about.  There are several missing items here that I just didn't include that I'll list here.  The idea is to use as few antennas as possible to do continuous scans from 0-1700Mhz using Realtek RTL TV dongles as SDR 'software defined radio' front ends.

Although the discone actually provides pretty decent results on most frequencies involved, I decided to make use of a folded dipole I have to improve gain resolution on the HF-63Mhz range.

Use of RF splitters means I can use fewer antennas.  However, there is some gain loss there, which is then made up by the LNAs (Low Noise Amplifiers).  These LNAs can be bought reasonably cheap on Ebay.  High gain and low noise are best, but for this project an 'acceptable' NF is probably 1-3 dB max.

The LNAs are used to counter the losses in the RF splitters, as well as to provide some additional gain over the antennas.  It's a cheap solution, not perfect, but really just fine for this project.
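
To see why putting an LNA ahead of a splitter helps, here is a small sketch of the standard Friis cascade formula for noise figure.  All the numbers are placeholder assumptions, not measurements from this system: a 20 dB gain, 2 dB NF LNA, a 4-way splitter with 6.5 dB loss (a passive loss has a noise figure equal to its loss), and a dongle with a 6 dB NF.

```python
import math


def db_to_lin(db: float) -> float:
    return 10 ** (db / 10)


def lin_to_db(x: float) -> float:
    return 10 * math.log10(x)


def cascade_nf_db(stages) -> float:
    """Friis formula. stages is a list of (gain_db, nf_db) in signal order."""
    total_f = 0.0
    gain_so_far = 1.0  # linear gain of all stages before the current one
    for i, (gain_db, nf_db) in enumerate(stages):
        f = db_to_lin(nf_db)
        total_f = f if i == 0 else total_f + (f - 1) / gain_so_far
        gain_so_far *= db_to_lin(gain_db)
    return lin_to_db(total_f)


# Placeholder values, assumed for illustration only.
LNA = (20.0, 2.0)        # gain dB, NF dB
SPLITTER = (-6.5, 6.5)   # passive 4-way splitter: NF equals its loss
DONGLE = (0.0, 6.0)      # assumed receiver noise figure

without_lna = cascade_nf_db([SPLITTER, DONGLE])
with_lna = cascade_nf_db([LNA, SPLITTER, DONGLE])
print(f"splitter -> dongle:        {without_lna:.1f} dB")
print(f"LNA -> splitter -> dongle: {with_lna:.1f} dB")
```

With these assumed values the splitter and dongle alone come out at 12.5 dB, while the same chain with the LNA in front stays within about half a dB of the LNA's own noise figure.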

NOT SHOWN in this map are:

  • The use of FM broadcast band 'band-stop' filters just before the left-most 0-200 MHz LNA and just before the top-level RF splitter.  Due to some fairly local strong FM broadcast stations, this is required to minimize overloading the RTL front ends.
  • The use of a high pass filter that starts at 45 MHz before the top-level RF splitter.  This keeps strong HF signals from overloading the RTLs being used for the higher frequencies.  This became necessary after noticing around sundown how HF broadcast stations were overloading RTLs being used at VHF, UHF and beyond.
  • The GUI used to operate all of this, which does not completely exist yet.  Eventually this will be a Web GUI to control all aspects of the system, including but not limited to:
    • MySQL administration
    • Apache Hadoop Administration
    • 'Gearmand', used to control many things, including RTL tuning.
    • Generation of FFT display
    • Querying MySQL, Apache Hadoop, Hive, Impala, Log Data, and Data stored in Images which contain things like frequency, time, and power levels at very high resolution.

So far, using portions of the system (most of the system mapped out above doesn't exist yet), I've been able to observe the following:

  • Meteor Scatter.
  • Maximum Usable Frequency "MUF" (daily-diurnal, foEs).
  • Solar Flares.
  • HF Radio Blackouts related to Solar Flare events which correlate perfectly with these events.
  • Spectrum use, or lack thereof.
  • Low resolution scans (example: 0-63Mhz in 1Mhz resolution over 2 weeks @ 1 minute increments)
  • High resolution scans (example: 1000 MHz to 1100 MHz over 12 hours in 1.3 kHz resolution at 1 minute increments)
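
To give a feel for the data volumes behind the two scan examples, here is a quick back-of-the-envelope sketch.  The function name is made up, and it assumes one power value per resolution bin per 1 minute sweep.

```python
def scan_points(span_hz: float, resolution_hz: float,
                duration_minutes: int, sweep_minutes: int = 1) -> int:
    """Total power readings: bins per sweep times number of sweeps."""
    bins_per_sweep = round(span_hz / resolution_hz)
    sweeps = duration_minutes // sweep_minutes
    return bins_per_sweep * sweeps


# 0-63 MHz at 1 MHz resolution, every minute for 2 weeks.
low_res = scan_points(63e6, 1e6, 14 * 24 * 60)
# 1000-1100 MHz at 1.3 kHz resolution, every minute for 12 hours.
high_res = scan_points(100e6, 1.3e3, 12 * 60)

print(f"low resolution scan:  {low_res:,} readings")
print(f"high resolution scan: {high_res:,} readings")
```

Under that assumption the two-week low resolution scan is about 1.3 million readings, while just 12 hours of one 100 MHz high resolution scan is about 55 million.
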

This coming week I expect to receive 3 new rack mount Linux servers that will be used in a new small Hadoop cluster here for massive data storage.  Once this has been set up, I expect to be able to retain very large sets of high resolution data and to query it in reasonable human-usable times (seconds to minutes, instead of minutes to days).  Over the coming months I'll be working with the system, and with others doing similar work, to improve on what I've laid out so far.  - Stay Tuned.