Establish a Red Hen capture station in Portugal.
Note: instructions on this page are based on instructions provided on the page for Slick Capture.
We now have a fully automated capture system based on a tiny $40 Raspberry Pi -- a credit-card-sized motherboard:
For the executive version, see How to set up a Red Hen capture station.
$ sudo mount /dev/sda1 /mnt/HD1
See discussion at Advanced Users. It may be worth moving the system logs to a USB drive to spare the SD card.
If system logs are moved to USB, we may be able to physically set the little tab on the SD card to a locked position to prevent any changes, as a security measure; see Protect your Raspberry Pi SD card with a read-only file system.
Backup of the SD Card, according to (https://www.raspberrypi.org/documentation/linux/filesystem/backup.md):
Or with a better view of the process, using pipe viewer:
sudo pv -tpreb /dev/mmcblk0 -s 30908350464 | dd of=/mnt/HD1/RPi_backups/backup_fenix_20151212_230304.img bs=1M conv=sync,noerror iflag=fullblock
-- where 30908350464 is the size of the SD card. This is implemented in a community-developed backup script, called sdbackup.sh on fenix, which must be run as user root:
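The sdbackup.sh script itself is not reproduced on this page. As a rough sketch of the same idea, and not the actual script, the functions below read the card size with blockdev instead of hard-coding it and build a timestamped file name. The function names, the default device, and the default target directory are assumptions taken from the example command above.

```shell
#!/bin/bash
# Sketch only: names and defaults are assumptions, not the contents of sdbackup.sh.

# Build a file name such as backup_fenix_20151212_230304.img
backup_name() {
    local host="$1" stamp="$2"
    printf 'backup_%s_%s.img\n' "$host" "$stamp"
}

# Copy the whole SD card to an image file, showing progress with pv.
# Must be run as root. Defined here but not called automatically.
backup_sdcard() {
    local dev="${1:-/dev/mmcblk0}" dir="${2:-/mnt/HD1/RPi_backups}"
    local size out
    size=$(blockdev --getsize64 "$dev") || return 1   # card size in bytes
    out="$dir/$(backup_name "$(hostname -s)" "$(date +%Y%m%d_%H%M%S)")"
    pv -tpreb -s "$size" "$dev" | dd of="$out" bs=1M conv=sync,noerror iflag=fullblock
}
```

As root, a call such as backup_sdcard /dev/mmcblk0 /mnt/HD1/RPi_backups would then write the image to the mounted USB drive.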
Restore of the backup, according to (https://www.raspberrypi.org/documentation/linux/filesystem/backup.md):
Before we can automate the capture process, we need to configure the operating system and install the Red Hen capture scripts.
By Francis Steen
A Red Hen capture station uses a customized environment of Debian Linux.
Create the following list of repositories with the command "sudo nano /etc/apt/sources.list":
# RPI Sources.list 2015-06-20
# To sign keys --
# gpg --keyserver pgp.mit.edu --recv-keys $KEY
# gpg --keyserver subkeys.pgp.net --recv-keys $KEY
# gpg --armor --export $KEY | apt-key add -
# Raspbian Stable
#deb http://mirrordirector.raspbian.org/raspbian/ jessie main contrib non-free rpi
deb http://mirror.ox.ac.uk/sites/archive.raspbian.org/archive/raspbian/ jessie main contrib non-free rpi
# Uncomment line below then 'apt-get update' to enable 'apt-get source'
deb-src http://archive.raspbian.org/raspbian/ jessie main contrib non-free rpi
# Raspbian Testing
#deb http://mirrordirector.raspbian.org/raspbian/ stretch main contrib non-free rpi
deb http://mirror.ox.ac.uk/sites/archive.raspbian.org/archive/raspbian/ stretch main contrib non-free rpi
deb-src http://archive.raspbian.org/raspbian/ stretch main contrib non-free rpi
# Debian Multimedia (http://www.deb-multimedia.org/) -- unstable used for handbrake-cli 2015-11-24
deb http://www.deb-multimedia.org jessie main non-free
#deb http://www.deb-multimedia.org stretch main non-free
#deb http://www.deb-multimedia.org sid main non-free
Since we need a few packages from testing (currently stretch), let's set pinning values in /etc/apt/preferences:
# Pinning values: see https://wiki.debian.org/AptPreferences
# apt-get install xmltv/unstable -- only the package, not the dependencies
# apt-get -t unstable install xmltv -- also the dependencies
# apt-cache policy -- inspect priorities as set here
# This sets the priorities correctly for stable (jessie), but not for testing and unstable.
# The order is critical -- FIXME!
Pin: release a=unstable
Pin: release a=testing
Pin: origin www.deb-multimedia.org
Pin: origin http.debian.net
Pin: release a=stable
For the upgrade to Raspberry Pi 3B, add
Install the appropriate locales, including en_US.UTF-8, en_US, and pt_PT.UTF-8. Set default locale to en_US.UTF-8.
For the exim4 mail transport agent (mta), define a fully qualified domain name, e.g., fenix.universia.pt, in /etc/mailname. Also add this line to /etc/hosts (fully qualified domain name, used by alpine):
Configure the local timezone with tzselect, if that wasn't done on installation. Red Hen capture stations should use the local timezone, not UTC. See detailed instructions.
Create local user csa:csa and add this to ~/.profile:
Since the recording system may be unattended for long periods, also set the system time zone in the OS:
Configure the network time protocol daemon -- queries atomic clocks on the web:
Note that a Raspberry Pi does not have a hardware clock, so each time it is rebooted it needs to read the time from the Internet.
NTP uses port 123 to communicate with time servers. Where this port is blocked, add this line to root's crontab to set the time at some interval, such as once an hour:
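The crontab line itself is missing from this page. One common approach where only web traffic is allowed is to read the Date header of an HTTP response. The following is a minimal sketch of that approach, not the original line; it assumes curl is installed and that www.debian.org is reachable, and the function names and script path are hypothetical.

```shell
#!/bin/bash
# Sketch only: not the original crontab line.

# Print the value of the Date: header from HTTP response headers on stdin.
http_date() {
    tr -d '\r' | awk 'tolower($1) == "date:" { sub(/^[^:]*:[ \t]*/, ""); print; exit }'
}

# Set the system clock from a web server's Date header (run as root).
# Defined here but not called automatically.
set_time_from_web() {
    local url="${1:-http://www.debian.org}" stamp
    stamp=$(curl -sI "$url" | http_date)
    [ -n "$stamp" ] && date -s "$stamp"
}

# Possible hourly entry in root's crontab, assuming the functions above are
# saved, followed by a call to set_time_from_web, as the hypothetical script
# /usr/local/bin/set-time-from-web.sh:
# 0 * * * * /usr/local/bin/set-time-from-web.sh
```

The Date header has a resolution of one second, which is coarser than NTP but should be adequate for starting recordings on schedule.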
Alternatively, use dvbdate from dvb-tools to set the time (untested). This is the preferred solution, since it's not dependent on a live Internet connection.
Finally, make sure you restart the cron daemon, so that it will use the new timezone:
If you forget, the crontabs of the recording schedule will use the previous timezone.
Create the file /etc/redhen.config with these local parameter values:
In /etc/lightdm/lightdm.conf, set
and add this:
Because fenix filters the VNC ports, we will need to use VNC tunneling. If the Raspberry Pi is located behind a firewall that filters all ports, we will need to use ssh tunneling; a good solution is autossh (see instructions):
autossh -M 20000 -f -N cartago -R 1234:localhost:22 -C
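To spell out what the autossh command above does, the sketch below assembles the same command from named parts and prints it, together with the command you would run on cartago to reach the Pi through the tunnel. The variable and function names are illustrative, and the csa login is an assumption based on the local user created earlier.

```shell
#!/bin/bash
# Sketch only: prints the commands instead of running them.

RELAY=cartago        # host outside the firewall that accepts the tunnel
MONITOR_PORT=20000   # -M: port autossh uses to check that the tunnel is alive
REMOTE_PORT=1234     # port opened on the relay
LOCAL_SSH_PORT=22    # sshd on the Raspberry Pi

# -f: go to the background, -N: run no remote command, -C: compress traffic,
# -R: forward REMOTE_PORT on the relay back to the Pi's ssh port.
tunnel_command() {
    echo "autossh -M $MONITOR_PORT -f -N $RELAY -R $REMOTE_PORT:localhost:$LOCAL_SSH_PORT -C"
}

# Run on the relay to log in to the Pi through the tunnel.
login_command() {
    echo "ssh -p $REMOTE_PORT ${1:-csa}@localhost"
}

tunnel_command
login_command
```

Because the Pi opens the connection outward, nothing needs to be opened on the firewall in front of it; autossh restarts the tunnel if it drops.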
Install and configure fail2ban:
Turn off password login (/etc/ssh/sshd_config) to secure the system. Verify with nmap that no unnecessary ports are open; if you run VNC, shut it down between uses. Unless you need an ssh tunnel, do not allow any outbound ssh connections.
Change /etc/motd to a welcome message, say
Welcome to Red Hen Lab's Raspberry Pi 2B recording station at the Instituto Politécnico da Guarda in Portugal
Download and compile dvb-dvbt-ts, a Perl utility for repairing television transport stream files:
Install the resulting .deb file.
The HDHomeRun device we are using with fenix is HDHR4-2DT, model hdhomerun4_dvbt. We've installed the drivers provided by Silicon Dust, and I also installed the native Linux driver dvbhdhomerun. This involves building an out-of-tree kernel module, dvb_hdhomerun. I discovered we are running the 4.1.13-v7+ kernel; I'm not sure why or how we got such a recent kernel. I couldn't find the source, and then found rpi-update, which is used by developers to test new kernels and firmware. Before I realized this was not intended for regular users, I ran rpi-update, which may have updated the firmware and reinstalled the same 4.1.13-v7+ kernel. I still couldn't find the source package, but then found the headers. Installing them prompted for the installation of gcc-4.7, which reassuringly installed from raspbian jessie. The packages then installed and built the new module:
just install cmake libhdhomerun-dev dkms dh-systemd module-assistant
git clone https://github.com/h0tw1r3/dvbhdhomerun
cd dvbhdhomerun && dpkg-buildpackage -b
just install dvbhdhomerun-dkms_0.0.16+nmu3_all.deb
just install dvbhdhomerun-utils_0.0.16+nmu3_armhf.deb
The dvbhdhomerun package provides drivers for the HDHomeRun devices that are fully integrated into the Linux digital television framework. This means we can use the usual Linux tools such as w_scan and gnutv -- tools we've used for years at the capture stations at UCLA. Importantly, we may also be able to use tvheadend, which provides a web-based scheduling tool that connects with the xmltv EPG. If the driver fails, we can remove the packages and maybe downgrade the kernel.
Raspbian comes with an LXDE menu; to customize, use the menu editor and see Adding LXDE start menu entries and desktop shortcuts. Menu entries are defined in .desktop files located in /usr/share/applications; they can be added and edited. For Raspberry Pi 3B (see updates), we cloned the fenix image and added the Bluetooth plugin for the lxpanel taskbar by right-clicking on the taskbar and following the menu selections. You can then click on the Bluetooth icon in the taskbar, make the RPi discoverable, scan for devices, and pair them. Once paired, the devices should automatically reattach on reboot.
If the devices do not reattach on reboot, or if you want access to the GUI from another computer for any reason, you can run VNC on the RPi by issuing
vncserver -geometry 800x600 -depth 24 :1
On your laptop, to create a safe connection, forward the VNC port for screen 1:
ssh fenix.local -nNT -L 5991:localhost:5901 -l csa
You can then connect to the Raspberry Pi desktop with this command on a Mac:
or run any VNC client.
Set up the HDHomeRun
By Jose Fonseca, Polytechnic Institute of Guarda, Portugal
From a terminal window connected to the Raspberry Pi download the file http://download.silicondust.com/hdhomerun/libhdhomerun_20150615.tgz
$ wget http://download.silicondust.com/hdhomerun/libhdhomerun_20150615.tgz
$ tar -xvzf libhdhomerun_20150615.tgz
From a terminal window connected to the Raspberry Pi download the file http://download.silicondust.com/hdhomerun/hdhomerun_config_gui_20150615.tgz
$ wget http://download.silicondust.com/hdhomerun/hdhomerun_config_gui_20150615.tgz
$ tar -xvzf hdhomerun_config_gui_20150615.tgz
Install the libgtk library needed to install the HDHomeRun
$ sudo apt-get install libgtk2.0-dev
sudo make install
Search for the HDHomeRun on the subnet
Scan tuner 0 for channels
$ hdhomerun_config $DEVICE set /tuner$TUNER/channel auto:56
$ hdhomerun_config $DEVICE get /tuner$TUNER/status
by Francis Steen
I've added the Red Hen scripts and directories, as follows:
I built ccextractor-0.78 and copied these scripts to /usr/local/bin:
The recording system uses crontabs for scheduling. The syntax is as follows (see channel -h):
channel 1, 30min, "Program name", 1
where channel is the name of the script, 1 is the RTP-1 program 1101, "Program name" you fill in, and 1 is just a marker for our online schedule validator, which we may or may not end up activating.
Since /usr/local/bin/channel just links to the real script, you can do this to see the details:
bash -xv channel_hdhr_2015-11-06.sh 1, 2min, TEST, 1
Files are recorded to /mnt/spool. Ideally, this would be an external hard drive, to remove some of the load on the solid-state drive. It could even be a 7200rpm externally powered drive, fast enough to handle two simultaneous recordings, though this is not of first importance.
A crontab entry might look like this:
1 3 17 6 * channel 58, 60min, "US Presidential Politics", 3, "Donald Trump Presidential Campaign Announcement"
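To make the layout of such an entry explicit, the sketch below splits a crontab line into the five standard schedule fields and the command that follows. The function name is illustrative; it only inspects the line and does not call the channel script.

```shell
#!/bin/bash
# Sketch only: splits a crontab line into its schedule and its command.

describe_entry() {
    local minute hour day month weekday command
    # The first five words are the schedule; the rest of the line is the command.
    read -r minute hour day month weekday command <<< "$1"
    printf 'minute=%s hour=%s day=%s month=%s weekday=%s\n' \
        "$minute" "$hour" "$day" "$month" "$weekday"
    printf 'command=%s\n' "$command"
}

describe_entry '1 3 17 6 * channel 58, 60min, "US Presidential Politics", 3'
```

Read this way, the example entry starts a 60-minute recording of channel 58 at 03:01 on 17 June, in whatever timezone the cron daemon is using.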
I created a crontab for user csa with a single-run test example.
When it ran, it also generated an e-mail receipt:
You can see these e-mails in the alpine mail reader. So far, everything seems to be working, so if you could schedule some actual news programs, that would be great.
The file that ends in .t is a header file, used to construct the eventual teletext file, but I've not set that up yet.
You can move into the directory tree /tv that contains the recorded files by typing
for today, or
for yesterday, and so on going backwards in time. The directories are created automatically.
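The commands themselves are missing from this page. As a sketch of how such a shortcut could work, the function below prints the directory for a given number of days back. It assumes GNU date and a /tv/YYYY/YYYY-MM/YYYY-MM-DD layout; both the layout and the function name are assumptions, so adjust them to what you actually see under /tv.

```shell
#!/bin/bash
# Sketch only: the directory layout and the function name are assumptions.

# Print the recording directory for N days ago (default 0, meaning today).
tv_day_dir() {
    local days_back="${1:-0}"
    date -d "$days_back days ago" +/tv/%Y/%Y-%m/%Y-%m-%d
}

# cd "$(tv_day_dir)"      # today
# cd "$(tv_day_dir 1)"    # yesterday
```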
The list of Portuguese channels is:
PROGRAM 1101: 1 RTP 1
The programs are all on eu-bcast:56. Add this information to the parameter file /tvspare/tuners/lineup, along with the network ID (from the xmltv downloads in /tvspare/xmltv), the country, the language, and the most-used teletext page number:
The lineup file is used as a lookup table by the various scripts that need this information.
The three main channels (RTP, which is public, and SIC and TVI, which are private) broadcast their evening news at 8 pm. Viewer shares are highest for SIC, followed by TVI, with RTP the least watched (http://binaries.cdn.impresa.pt/dealer/2246924/AUDIENCIAS-Abril+20144205681231226447556.pdf).
Here's the link to the schedules:
SIC - http://sicnoticias.sapo.pt/programas/jornaldanoite/
There is a show that might be of interest. Every week they choose a news topic and invite politicians, experts and other relevant players to discuss it. Prós & Contras: http://www.rtp.pt/play/p1772/Pros-e-Contras
According to Wikipedia's list of networks with teletext, these three networks all have teletext numbers:
Sociedade Independente de Comunicação: teletext page 888 for live captions—online teletext
After completing the operating system and tv capture configurations, the recording schedule can be automated through crontab, as in this example for user pi:
The full crontab looks like this:
The script xmltv-download.sh fetches the broadcast schedule daily for the networks we're interested in. In honor of the new recording location, we wrote a new script, scheduler.sh, which examines the downloaded schedule for shows we record and automatically generates the crontab recording schedule; see Automatic scheduling.
For the entire process to be automated, we also need the scripts that extract the text, compress the video, and copy the completed files to NewsScape, as described in the sections below.
The Portuguese television transport stream typically contains a timestamped transcript of the news show. We use CCExtractor to extract these subtitles. For how to retrieve and compile CCExtractor, see Brazil Capture Station.
The teletext pages that carry the transcripts of the main news programs are as follows (see RTP teletext):
Note that different types of programs use different teletext pages -- in the case of RTP:
To look for the embedded teletext, try these commands:
You may get teletext, or no text but a block of suggested teletext pages to check:
To check each of these, issue this command on one line:
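The original one-line command is missing from this page. As a sketch of the idea, the function below prints one ccextractor command per candidate page, using the -teletext and -tpage options. The function name, the file name, and the candidate pages 885 and 887 are made-up examples; only page 888 is mentioned elsewhere on this page.

```shell
#!/bin/bash
# Sketch only: prints one ccextractor command per candidate teletext page.
# Pipe the output to "bash" to actually run the commands.

teletext_trials() {
    local video="$1" page
    shift
    for page in "$@"; do
        # -teletext forces teletext mode; -tpage selects the page to extract
        printf 'ccextractor -teletext -tpage %s "%s" -o "%s.%s.srt"\n' \
            "$page" "$video" "${video%.*}" "$page"
    done
}

teletext_trials recording.mpg 885 887 888
```

Each trial writes to its own output file, so an empty file indicates a page that carries no captions for that recording.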
Once you do find text, add the teletext page you identified to the cc-extract-teletext.sh script and run it:
Our first results for SIC teletext page 888 are excellent, at least for the telenovela A Regra do Jogo:
RTP-1 broadcasts Telejornal, its main evening news show, at 20:00, with excellent teletext captions. We now record this show on a regular schedule, tracked by the new schedule script.
We currently transfer the uncompressed files to the Hoffman2 high-performance computing cluster at UCLA, which processes the files and sends them to the NewsScape search engines and archival servers. However, in some cases it may be necessary or desirable to perform the compression locally, either with software codecs or with the hardware codec built into the Raspberry Pi.
Video compression is done with HandBrake or ffmpeg. Get HandBrake with NEON support:
It's supposed to encode SD video faster than real time.
The current script uses ffmpeg, with HandBrake as a fallback if ffmpeg fails. The process is scripted:
On a Raspberry Pi 3B, such as odin, software compression takes around twice the length of the recording. This is fast enough to be a realistic option. On a Raspberry Pi 2B, such as redhen3rpi, software compression takes 4 hours and 39 minutes to process a one-hour 981MB mpeg file; this is typically too slow to be useful.
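The comparison is easier to reason about as a ratio of encoding time to recording length. A small sketch of that arithmetic, using the figures quoted above (the function name is illustrative):

```shell
#!/bin/bash
# Encoding time divided by recording length, both in minutes.
realtime_factor() {
    awk -v enc="$1" -v len="$2" 'BEGIN { printf "%.2f\n", enc / len }'
}

realtime_factor 279 60   # Raspberry Pi 2B: 4 h 39 min for a one-hour file
realtime_factor 120 60   # Raspberry Pi 3B: about twice the recording length
```

A factor above 1 means encoding runs slower than real time. At 4.65, five hours of daily recordings would already need more than 23 hours of encoding, while at 2.00 the same load needs ten.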
The Raspberry Pi also has a hardware compression chip. To support it, build a custom version of ffmpeg:
You can then use the ts2mp4-single-02.sh script as above and encode a one-hour video in about 20 minutes. The quality is decent, but not as good as software-encoded files.
are now available to the UCLA campus and the Red Hen community of students and researchers. In the NewsScape search engine, the timestamped teletext is used as a search index and a navigational framework for the news videos:
Machine translation within the search engine:
Preparing a custom image
After creating a backup image of fenix -- see Backup and restore above and MicroSD backup -- you can mount the image to inspect and modify it. Since the full image contains two active partitions, you need to mount one partition at a time. To see the partitions:
You can now create custom mount points and mount the main partition like this:
This makes each partition of the backup image available for editing. You can mount the image partition on any Linux machine and edit the text files -- for instance, change the host name, or create a new /etc/redhen.config file.
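The partition listing and mount commands are missing from this page. A minimal sketch of the usual loop-mount approach follows, assuming 512-byte sectors. The image name, mount point, and start sector are example values, so read the real start sector from the Start column of fdisk -l for your image.

```shell
#!/bin/bash
# Sketch only: image name, mount point and sector numbers are examples.

# Byte offset of a partition that starts at the given sector.
partition_offset() {
    local start_sector="$1" sector_size="${2:-512}"
    echo $(( start_sector * sector_size ))
}

# Loop-mount one partition of an image file (run as root).
# Defined here but not called automatically.
mount_image_partition() {
    local image="$1" start_sector="$2" mountpoint="$3"
    mkdir -p "$mountpoint" &&
    mount -o loop,offset="$(partition_offset "$start_sector")" "$image" "$mountpoint"
}

# fdisk -l backup_fenix.img                                      # list partitions and start sectors
# mount_image_partition backup_fenix.img 122880 /mnt/fenix_root  # example start sector
```

Unmount with umount before copying or writing the image to a new card, so that all changes are flushed to the file.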
You can also create a virtual fenix by changing root to the image:
Now you can even run programs within the image, such as generating new rsa keys. Note that you can only do this on a Raspberry Pi -- on fenix itself or another RPi -- because the binary executables have been compiled for the ARM CPU. These commands allow us to take a full backup of fenix, and then customize the image for a new user. Some values you may want to customize:
Information not known beforehand can be added once the new unit is on site.
Change the passwords for all users as user root, using
To create new RSA keys, first recreate some needed devices as root:
This makes the distribution of images safer.