Sunday, February 9, 2014

Example RF to Hadoop System

This image depicts a map of an RF-to-data-storage system I've been thinking about.  Several items are missing from the map; I list them below.  The idea is to use as few antennas as possible to do continuous scans from 0-1700 MHz using Realtek RTL TV dongles as SDR (software defined radio) front ends.

Although the discone actually provides pretty decent results on most of the frequencies involved, I decided to use a folded dipole I have to improve gain on the HF to 63 MHz range.

Using RF splitters means I can use fewer antennas; however, each splitter introduces some loss, which is then made up by the LNAs (Low Noise Amplifiers).  These LNAs can be bought reasonably cheaply on eBay.  High gain and low noise are best; however, for this project an 'acceptable' NF (noise figure) is probably 1-3 dB max.

The LNAs counter the losses in the RF splitters and also provide some additional gain over the antennas.  It's a cheap solution, not perfect, but really just fine for this project.
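To put rough numbers on the splitter-loss versus LNA trade-off, here is a minimal sketch using the standard Friis formula for cascaded noise figure. Every value in it (20 dB LNA gain, 1 dB LNA NF, a 4-way splitter with 7 dB loss, a 6 dB NF for the dongle) is an assumption for illustration, not a measurement from this system, and the stage order is just one possible chain, not necessarily the one in the map.

```python
import math

def db_to_ratio(db):
    """Convert decibels to a linear power ratio."""
    return 10 ** (db / 10.0)

def ratio_to_db(ratio):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(ratio)

def cascade(stages):
    """Friis cascade over a list of (gain_db, nf_db) stages in signal order.

    Returns (total_gain_db, total_nf_db).
    """
    gain_so_far = 1.0   # linear gain of all stages before the current one
    total_factor = 0.0  # linear noise factor of the chain so far
    for index, (gain_db, nf_db) in enumerate(stages):
        factor = db_to_ratio(nf_db)
        if index == 0:
            total_factor = factor
        else:
            # Later stages contribute less noise the more gain precedes them.
            total_factor += (factor - 1.0) / gain_so_far
        gain_so_far *= db_to_ratio(gain_db)
    return ratio_to_db(gain_so_far), ratio_to_db(total_factor)

# Illustrative values only (assumptions, not measurements).
LNA = (20.0, 1.0)       # 20 dB gain, 1 dB NF
SPLITTER = (-7.0, 7.0)  # passive 4-way splitter: 7 dB loss, so NF is 7 dB
DONGLE = (0.0, 6.0)     # RTL dongle modelled as a 6 dB NF stage

gain_with, nf_with = cascade([LNA, SPLITTER, DONGLE])
gain_without, nf_without = cascade([SPLITTER, DONGLE])
print(f"LNA ahead of splitter: gain {gain_with:.1f} dB, NF {nf_with:.1f} dB")
print(f"No LNA:                gain {gain_without:.1f} dB, NF {nf_without:.1f} dB")
```

With these assumed numbers, the LNA placed ahead of the splitter keeps the overall noise figure within the 1-3 dB range mentioned above, while the bare splitter plus dongle is far worse. Swap in the datasheet values of your own parts to see where your chain lands.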

NOT SHOWN in this map are:

  • FM broadcast band 'band-stop' filters just before the left-most 0-200 MHz LNA and just before the top-level RF splitter.  Some fairly strong local FM broadcast stations make these necessary to minimize overloading the RTL front ends.  Also missing from the map is a high-pass filter that starts at 45 MHz, placed before the top-level RF splitter.  This keeps strong HF signals from overloading the RTLs being used for the higher frequencies.  It became necessary after I noticed, around sundown, that HF broadcast stations were overloading the RTLs being used at VHF, UHF and beyond.
  • The GUI used to operate all of this, which does not completely exist yet.  Eventually a Web GUI will control all aspects of the system, including but not limited to:
    • MySQL administration
    • Apache Hadoop Administration
    • 'Gearmand', used to control many things, including RTL tuning.
    • Generation of FFT display
    • Querying MySQL, Apache Hadoop, Hive, Impala, log data, and data stored in images, which contain things like frequency, time, and power levels at very high resolution.

So far, using portions of the system (most of the system mapped out above doesn't exist yet), I've been able to observe the following:

  • Meteor Scatter.
  • Maximum Usable Frequency "MUF" (daily-diurnal, foEs).
  • Solar Flares.
  • HF radio blackouts, which correlate perfectly with solar flare events.
  • Spectrum use, or lack thereof.
  • Low-resolution scans (example: 0-63 MHz at 1 MHz resolution over 2 weeks in 1-minute increments)
  • High-resolution scans (example: 1000 MHz to 1100 MHz over 12 hours at 1.3 kHz resolution in 1-minute increments)

This coming week I expect to receive 3 new rack-mount Linux servers that will be used in a new small Hadoop cluster here for massive data storage.  Once this has been set up, I expect to be able to retain very large sets of high-resolution data and query them in reasonable, human-usable times (seconds to minutes, instead of minutes to days).  Over the coming months I'll be working with the system, and with others doing similar work, to improve on what I've laid out so far.  - Stay Tuned.
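To give a feel for why the storage side matters, here is a back-of-the-envelope sketch that counts the power readings produced by the two example scans listed above. The function name and the assumption of one reading per frequency bin per sweep are mine, for illustration only; real tools may overlap or crop bins.

```python
import math

def scan_size(start_hz, stop_hz, bin_hz, duration_s, interval_s):
    """Return (bins_per_sweep, sweeps, total_readings) for one scan.

    Assumes one power reading per frequency bin per sweep.
    """
    bins = math.ceil((stop_hz - start_hz) / bin_hz)
    sweeps = duration_s // interval_s
    return bins, sweeps, bins * sweeps

MINUTE = 60
HOUR = 60 * MINUTE
DAY = 24 * HOUR

# Low-resolution example: 0-63 MHz, 1 MHz bins, 2 weeks, 1-minute sweeps.
low = scan_size(0, 63_000_000, 1_000_000, 14 * DAY, MINUTE)

# High-resolution example: 1000-1100 MHz, 1.3 kHz bins, 12 hours, 1-minute sweeps.
high = scan_size(1_000_000_000, 1_100_000_000, 1_300, 12 * HOUR, MINUTE)

print("low-res :", low)    # (63, 20160, 1270080)
print("high-res:", high)   # (76924, 720, 55385280)
```

The 100 MHz high-resolution scan alone produces tens of millions of readings in half a day, so covering the full range at that resolution around the clock grows very quickly.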


  1. Also something I just found that's interesting and I may work to include (at least something like it):

  2. BTW these hubs are POWERED hubs. Make sure you only use high-end powered hubs with this many RTLs; otherwise you'll overload the USB power limits on the motherboard. The results are unstable RTLs, crashing rtl_power runs, and/or misc. gibberish from the RTL via rtl_power.

    Not only is this important from the above standpoint, but a powered hub also lets you use a power source that is much cleaner than the onboard power supplies in most computers, which can introduce a fair amount of noise into your data.
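One way to catch the gibberish described above is to sanity-check each line that rtl_power writes before storing it. The sketch below assumes the usual rtl_power CSV layout as I understand it (date, time, Hz low, Hz high, Hz step, samples, then one dB value per bin). The function name and the specific checks are hypothetical, so adjust them to what your own runs produce.

```python
import math
from datetime import datetime

def parse_rtl_power_row(line):
    """Parse one rtl_power CSV line; return a dict, or None if it looks broken."""
    fields = [field.strip() for field in line.split(",")]
    if len(fields) < 7:  # 6 header columns plus at least one power value
        return None
    try:
        stamp = datetime.strptime(fields[0] + " " + fields[1],
                                  "%Y-%m-%d %H:%M:%S")
        hz_low = float(fields[2])
        hz_high = float(fields[3])
        hz_step = float(fields[4])
        samples = int(fields[5])
        powers = [float(field) for field in fields[6:]]
    except ValueError:
        return None  # non-numeric junk or a mangled timestamp
    if hz_low >= hz_high or hz_step <= 0:
        return None  # frequency columns make no sense
    if any(math.isnan(p) or math.isinf(p) for p in powers):
        return None  # NaN/inf readings are treated as a bad row here
    return {"time": stamp, "hz_low": hz_low, "hz_high": hz_high,
            "hz_step": hz_step, "samples": samples, "powers": powers}

# Made-up sample rows for illustration (not real captured data).
good = "2014-02-09, 18:00:00, 1000000000, 1000003900, 1300.00, 16, -45.1, -44.8, -46.0"
bad = "2014-02-09, 18:01:00, 1000000000, 10000@@#, 1300.00, 16, -45.1"

print(parse_rtl_power_row(good) is not None)  # True
print(parse_rtl_power_row(bad) is None)       # True
```

Counting rejected rows per dongle over time is a cheap way to notice which RTLs are starving for power.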