Hacker Public Radio

267 episodes

  • Hacker Public Radio

    HPR4647: UNIX Curio #7 - Compression

    2026/05/26
    This show has been flagged as Clean by the host.

    This series is dedicated to exploring little-known—and occasionally useful—trinkets lurking in the dusty corners of UNIX-like operating systems.


    In UNIX Curio #4 (HPR episode 4617), I teased the subject of file compression. Today I'm circling back to that.



    The history of data compression goes back at least to the 1970s, and in contexts outside UNIX and computers, probably even earlier. Somehow, it is refreshing to learn that humans have always struggled to have enough storage space to keep all the data they want to hang on to. One way around this limitation is to use some form of compression.



    I am only going to dive into
    lossless
    compression for this episode—that is, a compression method that can be reversed and will spit out the original data bit for bit. Lossy compression methods also have their places: you might be familiar with their use for audio (such as Ogg Vorbis or MP3); it's also used for images (such as JPEG). Lossy compression allows some of the original data to be thrown away, resulting in a smaller file than is possible with lossless compression, but the intent is for the result to still sound or look "good enough" to a human observer. Also, I am going to limit my discussion to generic methods used for many types of data; while FLAC does lossless compression, it is specifically designed just for audio.



    I should make clear that I have never studied computer science or information theory, so this episode will not get into the science behind various types of compression algorithms and how they differ. But in general, these methods take advantage of the fact that many types of data have recurring patterns. English text mostly consists of words that often re-appear many times—source code similarly has keywords and variable names that recur. Compression is accomplished by representing a piece of data that occurs multiple times with a symbol that is shorter in length.
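A rough way to see this effect for yourself is sketched below. It is only an illustration, not something from the show itself: the helper name gz_size and the 100,000-byte sample size are arbitrary choices. It feeds gzip two inputs of the same size, one made of a single repeated sentence and one of random bytes.

```sh
#!/bin/sh
# Print the number of bytes gzip produces for whatever arrives on stdin.
# (gz_size is a made-up helper name for this demonstration.)
gz_size() {
    gzip -c | wc -c | tr -d ' '
}

# 100,000 bytes of one sentence repeated over and over: full of recurring patterns.
repetitive=$(yes 'the quick brown fox jumps over the lazy dog' | head -c 100000 | gz_size)

# 100,000 bytes of random data: no patterns for the compressor to exploit.
random=$(head -c 100000 /dev/urandom | gz_size)

echo "repetitive text: $repetitive bytes after gzip"
echo "random bytes:    $random bytes after gzip"
```

The repeated text should shrink to a tiny fraction of its size, while the random data should come out no smaller than it went in.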



    The first compression program in the UNIX world I could find is called pack, from 1978 [1]. It was shortly followed in 1979 by a similar program called compact [2]. Both of these used a technique called Huffman coding, but with some differences between them. Files compressed with pack were given a .z extension and compact gave filenames a .C extension. Roughly every five or ten years after this, a new program would come along and achieve lasting popularity.



    There were, and still are, two opposing forces facing any new form of compression. Working in its favor were the advantages it provided: first among these was achieving a better compression ratio, but performance improvements such as speed or reduced memory usage could also be compelling. The force against any new method was the fact that it was not yet widely supported; it doesn't much help to have a smaller file if the people you share it with cannot decompress it.



    The next major advance in compression arose out of three scientific papers: two in 1977 and 1978 by Abraham Lempel and Jacob Ziv (called LZ77 and LZ78), and one by Terry Welch in 1984 which built on LZ78. This last method is typically referred to as LZW. Our UNIX Curio for today is a program called compress [3] that implements the LZW method. Files compressed this way are named with the extension .Z. I had always assumed that this was to honor Jacob Ziv, but now that I've researched the history, it seems more likely to be a follow-on from how files compressed by pack were named. Since pack did not use any of the Lempel-Ziv methods, I would guess that it used .z because that wasn't already taken by anything else, but that's pure speculation.



    I do recall encountering .Z files in the wild, but feel certain that hasn't happened in the last 25 years, maybe longer. If you need to expand one of these, uncompress [4] is the program to use (GNU's gunzip can also handle them [5]). However, there was a serious problem that arose with the LZ78 and LZW compression methods. Both of them were patented, and the owner became aggressive in seeking payment from developers and users. The compress utility was developed within two months of the publication of Welch's 1984 paper and was included in Bell Laboratories' Eighth Edition UNIX before these shakedowns started. The paper did not disclose that a patent had been filed, and apparently Spencer Thomas and the other developers of compress were unaware of it. The utility became popular for a while, and was even standardized by POSIX, but people moved away from LZW once the legal threats started.



    Another important advance came in 1991 and was called the DEFLATE compression method. It combined the un-patented LZ77 method with Huffman coding to achieve a similar level of compression as LZW (actually, often better) without the legal trouble. DEFLATE was developed for PKZIP and was soon adopted by the GNU project's gzip compressor. While Phil Katz (the "PK" in PKZIP) patented one way of implementing the DEFLATE method, it was possible to write a compressor and decompressor without infringing [6]; also, he apparently never tried to enforce the patent [7].



    As I mentioned in UNIX Curio #4, .zip is both an archive
    and
    a compression format. Each archive member can be compressed with one of several possible methods (or stored without compression). Unlike a
    tar
    file where compression can be applied to the entire archive, in .zip each archive member is compressed individually. This often means a .zip file will be slightly bigger than a
    tar
    file with the same contents compressed with
    gzip
    , because the .zip format cannot take advantage of duplication that occurs among more than one member of the archive. The vast majority of .zip files use only the DEFLATE and uncompressed storage methods and these are the only options if you want to follow the profile standardized in ISO/IEC 21320-1. Actually, since they both use DEFLATE,
    gzip
    is able to extract a .zip file in the special case where it only holds one member compressed with that method.
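The size difference between per-member and whole-archive compression can be approximated without a zip tool at all. The sketch below is my own illustration with made-up file names: it gzips three identical files one at a time (roughly what .zip does with DEFLATE) and then gzips a tar stream of the same three files. Random content is used so that the only redundancy available is the duplication between members.

```sh
#!/bin/sh
# Work in a scratch directory that is removed at the end.
work=$(mktemp -d)
cd "$work" || exit 1

# Three archive members with identical, incompressible content (20,000 random bytes).
head -c 20000 /dev/urandom > member1.bin
cp member1.bin member2.bin
cp member1.bin member3.bin

# .zip-style: each member compressed on its own, then the sizes added up.
separate=0
for f in member1.bin member2.bin member3.bin; do
    size=$(gzip -c "$f" | wc -c | tr -d ' ')
    separate=$((separate + size))
done

# tar-style: members bundled first, then the whole stream compressed once.
together=$(tar -cf - member1.bin member2.bin member3.bin | gzip -c | wc -c | tr -d ' ')

echo "compressed separately: $separate bytes"
echo "compressed together:   $together bytes"

cd / && rm -rf "$work"
```

The members are kept small on purpose: DEFLATE only looks back 32 KiB, so duplicates that sit further apart than that in the tar stream would not be noticed by gzip either.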



    From the 1990s onward, people paid significant attention to avoiding patent landmines, so only methods that didn't have that problem became broadly popular. While the patents on LZ78 and LZW have since expired, I feel like their most successful legacy was in discouraging people from using those methods, leading to DEFLATE taking the popularity crown.



    The next step came in 1996 and 1997 with the development of bzip and bzip2 by Julian Seward. The original method was quickly followed by bzip2, which was the version that achieved true popularity. They use the Burrows-Wheeler transform, which does not itself compress data but re-arranges it to make it more compressible; this is combined with other techniques [8]. (At least, that's my understanding. I told you, I'm not up on information theory.) This provides a significant reduction in the compressed size of the data compared to earlier methods; however, it is slower than DEFLATE both during compression and decompression.
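To make the re-arranging idea a little more concrete, here is a toy version of the transform. It is a textbook-style sketch for short strings, not how bzip2 actually implements it, and the function name bwt is my own. It lists every rotation of the input, sorts them, and keeps the last character of each.

```sh
#!/bin/sh
# Toy Burrows-Wheeler transform: list every rotation of the input, sort the
# rotations bytewise, and keep the last character of each sorted rotation.
# Only meant for short, single-line strings.
bwt() {
    awk -v s="$1" 'BEGIN {
        n = length(s)
        for (i = 1; i <= n; i++)
            print substr(s, i) substr(s, 1, i - 1)
    }' | LC_ALL=C sort | awk '{ printf "%s", substr($0, length($0)) } END { print "" }'
}

# The trailing "$" marks the end of the text so the transform can be reversed.
bwt 'banana$'   # prints: annb$aa
```

Notice that the output is no shorter than the input, but the repeated letters now sit next to each other, which is the kind of arrangement later compression stages can exploit.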



    Separate projects have developed parallel versions of
    gzip
    and
    bzip2
    that can take advantage of multi-processor machines, but the original utilities run single-threaded.



    Another five years later, in 2001, Igor Pavlov added the Lempel-Ziv-Markov chain algorithm (LZMA), an enhancement to LZ77, to his 7-Zip compression tool. This was followed a few years later by LZMA2, a container format that allowed for LZMA compression to be split between multiple threads. Broad LZMA2 support came to the UNIX world in 2009 with the xz utility [9]. It offers roughly similar compression ratios to bzip2, though it can be better or worse depending on the data to be compressed. While compression generally takes even longer than bzip2, decompression is significantly faster (though still not as fast as gzip). The Linux kernel relatively quickly supported booting from xz-compressed images [10] because it was a good match for that use case: compression, the time-consuming activity, only has to be done once while the more frequent decompression during boot happens relatively fast.



    The last method I will cover is Zstandard [11], often written as zstd. This came about in 2015, and is another variation on LZ77 that uses finite-state entropy (which means nothing to me, but you might understand it). It performs about as well as DEFLATE in terms of compression ratios, but is much faster both when compressing and decompressing data. I should say that these statements are true with the typical default settings; depending on the compression level selected, it can compress more slowly but produce a smaller result. However, decompression is always speedier than DEFLATE. This makes it attractive for some uses, and it is heavily promoted by Meta/Facebook, where Yann Collet developed it. For example, shipping large amounts of actively-used data between machines in a data center can go more quickly when the size is reduced; however, if the compression and decompression steps take too long that benefit is lost. A speedy method can be valuable even if it doesn't result in the greatest reduction in size. This use case stands in contrast to, say, a compressed backup file which might only be accessed in a disaster recovery scenario or never accessed at all, making size more important than speed.



    Both the
    xz
    and
    zstd
    utilities have some built-in support for multi-threading, but the default is to run in a single thread. While
    xz
    can use multiple threads for decompression (but only if the file was compressed in multi-thread mode), the reference
    zstd
    utility can only use more than one thread for compression, not decompression.
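For anyone who wants to try the multi-threaded modes, the sketch below shows one way to ask for them. It assumes the -T0 option (let the tool choose the thread count) is available in the installed xz and zstd, and the function name threaded_roundtrip is just something I made up. Each tool is skipped if it is not installed.

```sh
#!/bin/sh
# Round-trip a file through xz and zstd, requesting as many threads as the
# tool sees fit. Each tool is skipped if it is not installed.
threaded_roundtrip() {
    file=$1
    if command -v xz >/dev/null 2>&1; then
        # -T0 asks xz to pick a thread count based on the number of cores.
        xz -T0 -c "$file" > "$file.xz"
        xz -T0 -dc "$file.xz" | cmp -s - "$file" && echo "xz: round trip OK"
    fi
    if command -v zstd >/dev/null 2>&1; then
        # -T0 does the same for zstd when compressing.
        zstd -T0 -q -c "$file" > "$file.zst"
        zstd -dc "$file.zst" | cmp -s - "$file" && echo "zstd: round trip OK"
    fi
    return 0
}
```

Calling threaded_roundtrip on a large file while watching a process monitor is a simple way to see whether more than one core is being used.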



    There are many other methods of lossless compression that have been developed over the decades, but I believe these are the ones you are most likely to encounter in the world of UNIX-like systems. This is a personal opinion, and others might choose a different set. As mentioned, it can be tough for a new method to gain popularity and 35-year-old DEFLATE is still probably the most commonly used despite not being the fastest or offering the greatest reduction in size. Even systems like FreeBSD, NetBSD, and OpenBSD that do not like to include GNU tools supported it by developing their own version of
    gzip
    based on the permissively-licensed
    zlib
    library.



    Technically, the LZW method used by the
    compress
    utility is still standardized by POSIX, so one might expect it to have the widest support. However, aggressive patent enforcement discouraged adoption, especially by Free and Open Source Software systems—even though the patent has expired, it is still out of favor compared to DEFLATE. For this reason, I feel justified in calling it a curio.



    References:

    [1] Eighth Edition UNIX pack.c
        https://www.tuhs.org/cgi-bin/utree.pl?file=V8/usr/src/cmd/pack/pack.c

    [2] 2.9BSD compact.c
        https://www.tuhs.org/cgi-bin/utree.pl?file=2.9BSD/usr/src/ucb/compact/compact.c

    [3] Compress specification
        https://pubs.opengroup.org/onlinepubs/009695399/utilities/compress.html

    [4] Uncompress specification
        https://pubs.opengroup.org/onlinepubs/009695399/utilities/uncompress.html

    [5] GNU Gzip manual
        https://www.gnu.org/software/gzip/manual/gzip.html

    [6] RFC 1951: DEFLATE Compressed Data Format Specification version 1.3
        https://tools.ietf.org/html/rfc1951

    [7] History of Lossless Data Compression Algorithms: The Rise of Deflate
        https://ethw.org/History_of_Lossless_Data_Compression_Algorithms#The_Rise_of_Deflate

    [8] bzip2
        https://en.wikipedia.org/wiki/Bzip2

    [9] XZ Utils
        https://en.wikipedia.org/wiki/XZ_Utils

    [10] 2.6.38 merge window part 2
        https://lwn.net/Articles/423541/

    [11] zstd
        https://en.wikipedia.org/wiki/Zstd







    Appendix




    The table below demonstrates the results of compressing different types of data using tools described in this episode. While not totally rigorous, I did run each compression and decompression multiple times to ensure I was getting consistent results. The laptop I used has an Intel Core i5-6200U CPU running at 2.30GHz, and the system had at least 5 GB of free memory for each run. While this processor has two cores and can run four simultaneous threads, all utilities were run single-threaded.



    The term "best" means the highest level of compression available (the exact level used is shown). For
    bzip2
    , the default
    is
    the best. For
    zstd
    , "best" is -19, which is the highest "normal" level, but "ultra" levels that are even higher also exist. Ratios are the percentage of the original size that the file was reduced to (other sources might instead express the compression ratio as the
    reduction
    in size achieved). In all results, smaller numbers are better.



    ┌────────────────────────────┬─────────────┬─────────────┬─────────────┬─────────────┬─────────────┬─────────────┬─────────────┐
    │                            │    gzip     │    gzip     │    bzip2    │     xz      │     xz      │    zstd     │    zstd     │
    │                            │(default -6) │  (best -9)  │    (-9)     │(default -6) │  (best -9)  │(default -3) │ (best -19)  │
    ├──────────────┬─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┤
    │              │Size (ratio) │  22,036,508 │  21,891,623 │  15,795,698 │  13,487,768 │  12,938,464 │  20,454,657 │  13,709,078 │
    │              │             │       (24%) │       (24%) │       (17%) │       (15%) │       (14%) │       (23%) │       (15%) │
    │English Text  ├─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┤
    │(90,532,092   │Compression  │        4.8s │        7.6s │        8.5s │       49.8s │       58.8s │        0.6s │       65.2s │
    │bytes         │time         │             │             │             │             │             │             │             │
    │uncompressed) ├─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┤
    │              │Decompression│        0.7s │        0.8s │        3.7s │        1.2s │        1.2s │        0.4s │        0.4s │
    │              │time         │             │             │             │             │             │             │             │
    ├──────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┤
    │              │Size (ratio) │ 125,291,122 │ 124,189,544 │  98,016,512 │  84,882,492 │  81,954,344 │ 120,604,855 │  87,298,645 │
    │              │             │       (21%) │       (21%) │       (17%) │       (14%) │       (14%) │       (20%) │       (15%) │
    │Source Code   ├─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┤
    │(590,008,320  │Compression  │       22.0s │       39.3s │       54.8s │        241s │        298s │        3.7s │        348s │
    │bytes         │time         │             │             │             │             │             │             │             │
    │uncompressed) ├─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┤
    │              │Decompression│        5.1s │        5.1s │       20.3s │        8.1s │        7.8s │        2.4s │        2.4s │
    │              │time         │             │             │             │             │             │             │             │
    ├──────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┤
    │              │Size (ratio) │  32,830,905 │  32,371,241 │  26,856,579 │  20,717,288 │  20,352,880 │  28,538,810 │  23,154,582 │
    │              │             │       (19%) │       (19%) │       (16%) │       (12%) │       (12%) │       (17%) │       (13%) │
    │Binary Program├─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┤
    │(171,972,264  │Compression  │        6.4s │       22.4s │       18.6s │       62.2s │       67.8s │        0.8s │        111s │
    │bytes         │time         │             │             │             │             │             │             │             │
    │uncompressed) ├─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┤
    │              │Decompression│        1.5s │        1.5s │        5.6s │        2.3s │        2.3s │        0.7s │        0.7s │
    │              │time         │             │             │             │             │             │             │             │
    ├──────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┤
    │              │Size (ratio) │ 146,397,772 │ 146,397,757 │ 144,485,451 │ 131,950,232 │ 130,926,780 │ 147,154,979 │ 145,703,840 │
    │              │             │       (89%) │       (89%) │       (88%) │       (80%) │       (80%) │       (90%) │       (89%) │
    │WAVE Audio    ├─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┤
    │(164,396,302  │Compression  │        9.2s │        9.2s │       25.1s │       70.4s │       97.7s │        0.7s │       58.3s │
    │bytes         │time         │             │             │             │             │             │             │             │
    │uncompressed) ├─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┤
    │              │Decompression│        2.0s │        2.0s │       13.5s │       12.2s │       12.1s │        0.6s │        0.8s │
    │              │time         │             │             │             │             │             │             │             │
    ├──────────────┴─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┤
    │                            │    gzip     │    gzip     │    bzip2    │     xz      │     xz      │    zstd     │    zstd     │
    │                            │(default -6) │  (best -9)  │    (-9)     │(default -6) │  (best -9)  │(default -3) │ (best -19)  │
    └────────────────────────────┴─────────────┴─────────────┴─────────────┴─────────────┴─────────────┴─────────────┴─────────────┘





    English text consists of Titles 1 through 10 of the 2020 U.S.
    Code of Federal Regulations
    .



    Source code consists of a
    tar
    file containing the Linux kernel source, version 4.0.



    Binary program consists of an ELF-format executable of the
    pandoc
    application, version 2.17.1.1 found on Debian 12.



    Audio consists of a 24-bit Signed Integer PCM WAVE file with 2 channels at 44.1kHz, about 10:21 in length. For comparison, the audio-specific
    flac
    lossless compression utility reduced this file to 97,962,711 bytes (60%) in 2.6 seconds at the default (-5) level and to 97,714,876 bytes (59%) in 5.4 seconds at the highest (-8) level.
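If you would like to run a similar comparison on your own files, something along these lines should work. It is a simplified sketch rather than the exact procedure used for the table: the function name compare_tools is made up, only default levels are tried, timing is left out, and the input file is assumed to be non-empty.

```sh
#!/bin/sh
# Compress one file with each installed tool at its default level and report
# the compressed size and the percentage of the original size it was reduced to.
compare_tools() {
    file=$1
    orig=$(wc -c < "$file" | tr -d ' ')
    echo "original: $orig bytes"
    for tool in gzip bzip2 xz zstd; do
        # Skip any tool that is not installed on this system.
        command -v "$tool" >/dev/null 2>&1 || continue
        size=$("$tool" -c "$file" 2>/dev/null | wc -c | tr -d ' ')
        # Integer percentage, rounded to the nearest whole number.
        ratio=$(( (size * 100 + orig / 2) / orig ))
        echo "$tool: $size bytes (${ratio}%)"
    done
}
```

The ratio is computed the same way as in the table: for the English text with gzip at its default level, 22,036,508 × 100 / 90,532,092 rounds to 24. Prefixing a single compression command with the shell's time keyword gives a rough timing figure.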







    Provide feedback on this episode.
  • Hacker Public Radio

    HPR4646: Mobile Gaming

    2026/05/25
    This show has been flagged as Explicit by the host.



    Games Mentioned



    Tile Survive - https://tilesurvivegame.com/en

    Monopoly Go - https://apps.apple.com/us/app/monopoly-go/id1621328561

    Arrows Go - https://apps.apple.com/us/app/arrows-go-arrow-puzzles/id6758326278



    Apologies, Lee's mic is a bit metallic due to filtering out background noise from a fan


    Provide feedback on this episode.
  • Hacker Public Radio

    HPR4645: ZERO HOUR: FRIDAY AFTERNOON APK HACKING

    2026/05/22
    This show has been flagged as Explicit by the host.


    WARNING AI GENERATED NOTES AHEAD YMMV








    Here is a summary of the recorded training session regarding Android hacking from Hacker Public Radio, including web references for the main topics discussed.








    Overview




    The recording features a security consultant performing a live assessment of an Android application. The consultant uses a custom tool suite called "Jamboree" and various other utilities to test a location-sharing and vehicle management app. The session highlights the increasing complexity of mobile app security, specifically dealing with SSL pinning, encrypted traffic, and anti-tampering mechanisms.




    Environment and Tools




    The assessment is conducted on a rooted Android emulator. The speaker utilizes several tools to set up the environment and intercept traffic:







    Jamboree: A custom automation tool developed by the speaker over six years to handle rooting, proxy setup, and app installation within minutes.

    Burp Suite: The primary interception proxy used to analyze traffic between the app and the production server.

    Frida: Used to bypass anti-root detection and SSL pinning.

    Ghidra: A decompiler used to analyze the app's code, specifically helpful for patching the Flutter-based application.

    Android Debug Bridge (ADB): Used for troubleshooting, debugging, and analyzing logs (logcat) to extract user IDs and location data.






    Technical Challenges: SSL Pinning and Flutter




    The target application is built using Flutter and implements rigorous security controls, including SSL pinning, which prevents standard Man-in-the-Middle (MitM) attacks. The app's HTTP client ignores system and user-installed certificates, and it does not respect device Wi-Fi proxy settings.




    To overcome this:







    Traffic Redirection: The speaker uses iptables commands to force all HTTP and HTTPS traffic through the proxy's IP address at the network layer, working around the app's disregard for proxy settings.

    Patching with AI: The speaker leverages AI (specifically mentioning Claude and access to "Kuro") to assist in patching the APK. The AI helped navigate Ghidra and generate Python scripts to bypass the app's protections, allowing the modified APK to trust the auditor's certificate.

    Frida Scripts: "Frida anti-root SSL pinning" scripts are executed to further mitigate detection mechanisms.
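The notes do not give the exact iptables commands used in the session. As an illustration only, rules of the following general shape are a common way to redirect outbound web traffic on a rooted test device. The proxy address 10.0.2.2 and port 8080 are placeholder assumptions, and the script only prints the rules rather than applying them.

```sh
#!/bin/sh
# Print (not run) NAT rules that would send outbound web traffic to an
# intercepting proxy. PROXY_IP and PROXY_PORT are placeholder values.
PROXY_IP=${PROXY_IP:-10.0.2.2}
PROXY_PORT=${PROXY_PORT:-8080}

print_redirect_rules() {
    for port in 80 443; do
        echo "iptables -t nat -A OUTPUT -p tcp --dport $port -j DNAT --to-destination $PROXY_IP:$PROXY_PORT"
    done
}

print_redirect_rules
```

Because the app is not aware it is talking to a proxy in this setup, the intercepting proxy would likely need to be configured for transparent (sometimes called invisible) operation.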






    Key Vulnerabilities Identified





    1. Geolocation Spoofing





    The consultant successfully spoofed the device's GPS location using emulator settings (e.g., setting the location to Puerto Rico or Costa Rica). The application accepted this falsified location data as valid, indicating a lack of server-side verification for location origin.





    2. Insecure Direct Object Reference (IDOR) / Broken Access Control





    The most critical finding involves the app's user tracking feature.






    The consultant discovered that the API allows querying a user's location via a user_id.

    By intercepting traffic and analyzing adb logcat logs, the consultant extracted their own user_id and the user_id of a second test account.

    While authenticated as one user, the consultant was able to send a request substituting the user_id with the target's ID. The server responded with the target's GPS coordinates. This confirms that an authenticated user can track any other user's real-time location if they possess the target's ID.

    Proof of concept was created by copying the request as a curl command to demonstrate the exploit.







    3. Potential Information Disclosure





    The consultant began testing a feature that allows users to add vehicles by license plate. The concern is that querying a license plate might return excessive PII (Personally Identifiable Information), such as VIN numbers or registration details, beyond what the UI strictly requires (least privilege issue).





    4. Access Control (Calendar Feature)





    The consultant tested whether calendar events could be accessed by switching user_id parameters. This test resulted in a "401 Unauthorized" error, indicating that this specific endpoint had proper access control in place.




    Web References and Resources




    Below are references for the main tools and concepts discussed in the training:







    Hacker Public Radio: https://hackerpublicradio.org/

    Burp Suite (Web Security Testing): https://portswigger.net/burp

    Frida (Dynamic Instrumentation Toolkit): https://frida.re/

    Ghidra (Software Reverse Engineering): https://ghidra-sre.org/

    Android Debug Bridge (ADB): https://developer.android.com/tools/adb

    OWASP Mobile Top 10: https://owasp.org/www-project-mobile-top-10/

    OWASP Testing for Insecure Direct Object References (IDOR): https://owasp.org/www-project-web-security-testing-guide/v42/4-Web_Application_Security_Testing/04-Authorization_Testing/04.1-Testing_for_Insecure_Direct_Object_References

    Flutter (UI Toolkit): https://flutter.dev/








    Provide feedback on this episode.
  • Hacker Public Radio

    HPR4644: Response to comments on HPR4424: Newsboat...

    2026/05/21
    This show has been flagged as Clean by the host.


    Hi this is your host, Archer72 for Hacker Public Radio.






    In this episode I share some of my findings about a problem with
    the Newsboat naming of the HPR feeds, which was brought up in
    comments about my Newsboat show, HPR4424.







    hpr4424: How I use Newsboat for Podcasts: comment #6 :
    download-filename-format for HPR podcasts






    Ken already had some findings of his own about the
    ccdn.php extension in the feed.







    hpr4424: comment #10 : Summary of findings






    I thought that this might be able to be fixed on an individual
    basis, and set out to ask Claude.ai a few questions.






    But first, some collaboration from Dave Morriss about a good
    renaming format. This was definitely more on Dave’s side than
    mine, but we came up with this.

    You can tell Dave’s handiwork from the short variable names, which
    stem from his extensive experience on Unix-type machines in his
    university days.






    exif-rename-hpr-dave.sh





    #!/bin/bash

    URL="$(cat /tmp/hpr-url.txt)"
    echo "DEBUG URL: $URL" >> /tmp/hpr-debug.log

    AUDIO_URL="$(curl -s "$URL" | grep -Eo 'https?://[^"]*\.(ogg|mp3)' | head -1)"
    echo "DEBUG AUDIO: $AUDIO_URL" >> /tmp/hpr-debug.log

    if [[ -z "$AUDIO_URL" ]]; then
        echo "ERROR: Could not find audio URL from: $URL" >> /tmp/hpr-debug.log
        exit 1
    fi

    # Changed destination to HPR-queue
    DEST=~/podcasts/hub.hackerpublicradio.org/HPR-queue/

    # Record files present before download
    BEFORE="$(ls "$DEST"*.{ogg,mp3} 2>/dev/null | sort)"

    wget -nc --content-disposition -P "$DEST" "$AUDIO_URL"

    # Stop here if the destination directory is missing
    cd "$DEST" || exit 1

    # Record filename just downloaded (new file not in BEFORE)
    AFTER="$(ls "$DEST"*.{ogg,mp3} 2>/dev/null | sort)"
    DOWNLOADED="$(comm -13 <(echo "$BEFORE") <(echo "$AFTER"))"
    echo "DEBUG DOWNLOADED: $DOWNLOADED" >> /tmp/hpr-debug.log

    ~/bin/exif-rename-hpr-dave.sh

    # Find renamed file — newest file that wasn't in BEFORE
    AFTER_RENAME="$(ls "$DEST"*.{ogg,mp3} 2>/dev/null | sort)"
    RENAMED="$(comm -13 <(echo "$BEFORE") <(echo "$AFTER_RENAME"))"
    echo "DEBUG RENAMED: $RENAMED" >> /tmp/hpr-debug.log

    if [[ -n "$RENAMED" ]]; then
        echo "\"$AUDIO_URL\" \"$RENAMED\" downloaded" >> ~/.local/share/newsboat/queue
    else
        echo "WARN: Could not determine renamed file" >> /tmp/hpr-debug.log
        echo "\"$AUDIO_URL\" \"$DOWNLOADED\" downloaded" >> ~/.local/share/newsboat/queue
    fi





    At first the question was about something simple. The input was a
    query on one of the lines from Kevie’s







    hpr4398 :: Command line fun: downloading a podcast






    Particularly, the section on To get the latest episode of TuxJam






    wget `curl https://tuxjam.otherside.network/feed/podcast/ | grep -o 'https*://[^"]*ogg' | head -1`






    Which I re-wrote to:





    wget -nc -P ~/podcasts/TuxJam $(curl https://tuxjam.otherside.network/feed/ogg | grep -Eo 'https*://[^"]*ogg' | sort -u | xargs | head -1)





    The reason for $() instead of backticks to enclose a command was
    that the latter is the older form and is generally discouraged.







    GNU Bash Reference Manual - 3.5.4 Command Substitution
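One practical difference shows up when command substitutions are nested. The short sketch below uses a made-up path to compare the two forms.

```sh
#!/bin/sh
# Hypothetical file path, used only for the demonstration.
path=/home/user/podcasts/hpr4644.ogg

# Nested $( ) reads naturally: the inner command runs first.
parent=$(basename "$(dirname "$path")")

# The backtick equivalent needs the inner backticks escaped.
parent_old=`basename \`dirname "$path"\``

echo "$parent"      # prints: podcasts
echo "$parent_old"  # prints: podcasts
```

Both lines give the same answer, but the $( ) form stays readable as the nesting gets deeper.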






    -nc (--no-clobber) is to not re-download a podcast

    -P specifies the download directory






    I went in a different direction than downloading TuxJam and asked
    to download the last 2 HPR shows, but head -2 did not work as
    expected. This turned out to be an issue with the placement of
    xargs, which was joining all URLs and passing them to wget all at
    once.






    Original:





    wget -nc --content-disposition -P ~/podcasts/hub.hackerpublicradio.org/HPR-newsboat-test/ $(curl http://hackerpublicradio.org/hpr_ogg_rss.php | grep -Eo 'https*://[^"]*ogg' | sort -u | xargs | head -2)





    New:





    curl http://hackerpublicradio.org/hpr_ogg_rss.php | grep -Eo 'https?://[^"]*\.ogg' | sort -u | head -2 | xargs wget -nc --content-disposition -P ~/podcasts/hub.hackerpublicradio.org/HPR-newsboat-test/


    Key fixes:







    Removed the premature xargs before head -2, so head actually limits the list to 2 URLs

    Moved xargs wget ... to the end, after the list is already trimmed

    Cleaned up https* → https? (the original would also match httpssss, etc.)
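These fixes can be checked without touching the network. The sketch below uses a stand-in function (urls, with made-up example.org addresses) in place of the curl and grep stages, and echo in place of wget.

```sh
#!/bin/sh
# Stand-in for the list of URLs grep would pull out of the feed.
urls() {
    printf '%s\n' 'https://example.org/c.ogg' 'https://example.org/a.ogg' \
                  'https://example.org/b.ogg' 'https://example.org/a.ogg'
}

# xargs (which runs echo by default) joins everything onto one line,
# so head -2 has only a single line to keep and all URLs survive.
before=$(urls | sort -u | xargs | head -2)

# Trimming first leaves two lines; xargs then hands just those to the command.
after=$(urls | sort -u | head -2 | xargs echo)

echo "xargs before head: $before"
echo "head before xargs: $after"

# https* accepts any number of trailing "s" characters; https? accepts at most one.
loose=$(echo 'httpssss://example.org/x.ogg' | grep -Eo 'https*://[^"]*ogg')
strict=$(echo 'httpssss://example.org/x.ogg' | grep -Eo 'https?://[^"]*\.ogg' || true)

echo "https* matched: $loose"
echo "https? matched: ${strict:-nothing}"
```

The first ordering prints all three unique URLs, the second prints only two, and only the looser pattern accepts the misspelled scheme.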







    Now I wanted the downloaded file to go to the queue file, located
    in ~/.local/share/newsboat/queue

    After several iterations in Claude, it was determined that the
    audio URL was not being expanded by the %u in the download macro.






    The solution was to add a /tmp file to hold the actual audio URL





    macro d set browser "echo %u > /tmp/hpr-url.txt && ~/bin/download-and-rename-hpr.sh"; open-in-browser ; set browser "your-normal-browser"





    A few Claude questions later… After asking to add a function to
    the macro in order to add the resulting downloaded file to the
    queue, the file name hpr1234.ogg was being added to the queue
    instead of the renamed file.






    Two things to fix:







    The renamed file isn’t being found because
    ls -t runs before the rename has fully settled, or the
    glob isn’t matching the new filename format (which includes
    spaces and semicolons)



    The existing queue entries show the correct format:
    "url" "path" downloaded — we need to match that, with
    downloaded status and the full renamed path including
    extension







    Exif rename script





    #!/bin/bash
    # ~/bin/download-and-rename-hpr.sh

    URL="$(cat /tmp/hpr-url.txt)"
    echo "DEBUG URL: $URL" >> /tmp/hpr-debug.log

    AUDIO_URL="$(curl -s "$URL" | grep -Eo 'https?://[^"]*\.(ogg|mp3)' | head -1)"
    echo "DEBUG AUDIO: $AUDIO_URL" >> /tmp/hpr-debug.log

    if [[ -z "$AUDIO_URL" ]]; then
        echo "ERROR: Could not find audio URL from: $URL" >> /tmp/hpr-debug.log
        exit 1
    fi

    DEST=~/podcasts/hub.hackerpublicradio.org/HPR-newsboat-test/

    # Record files present before download
    BEFORE="$(ls "$DEST"*.{ogg,mp3} 2>/dev/null | sort)"

    wget -nc --content-disposition -P "$DEST" "$AUDIO_URL"

    # Stop here if the destination directory is missing
    cd "$DEST" || exit 1

    # Record filename just downloaded (new file not in BEFORE)
    AFTER="$(ls "$DEST"*.{ogg,mp3} 2>/dev/null | sort)"
    DOWNLOADED="$(comm -13 <(echo "$BEFORE") <(echo "$AFTER"))"
    echo "DEBUG DOWNLOADED: $DOWNLOADED" >> /tmp/hpr-debug.log

    ~/bin/exif-rename-hpr-dave.sh

    # Find renamed file — newest file that wasn't in BEFORE
    AFTER_RENAME="$(ls "$DEST"*.{ogg,mp3} 2>/dev/null | sort)"
    RENAMED="$(comm -13 <(echo "$BEFORE") <(echo "$AFTER_RENAME"))"
    echo "DEBUG RENAMED: $RENAMED" >> /tmp/hpr-debug.log

    if [[ -n "$RENAMED" ]]; then
        echo "\"$AUDIO_URL\" \"$RENAMED\" downloaded" >> ~/.local/share/newsboat/queue
    else
        echo "WARN: Could not determine renamed file" >> /tmp/hpr-debug.log
        echo "\"$AUDIO_URL\" \"$DOWNLOADED\" downloaded" >> ~/.local/share/newsboat/queue
    fi





    Using
    comm -13 on sorted before/after file lists is more
    reliable than
    ls -t — it finds exactly which file is new regardless of
    timing. Check
    /tmp/hpr-debug.log after the next run to see what
    DOWNLOADED and
    RENAMED show.
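The before/after comparison can be tried on its own in a scratch directory. The sketch below uses temporary files instead of process substitution so it also runs under a plain POSIX shell, and the file names are made up.

```sh
#!/bin/sh
# Scratch directories standing in for the download folder and the file lists.
dest=$(mktemp -d)
lists=$(mktemp -d)
touch "$dest/hpr4642.ogg" "$dest/hpr4643.ogg"

# Snapshot before the "download". LC_ALL=C keeps sort and comm in agreement.
ls "$dest" | LC_ALL=C sort > "$lists/before"

# Simulate a download followed by a rename to a title with spaces.
touch "$dest/hpr4644 Response to comments.ogg"

ls "$dest" | LC_ALL=C sort > "$lists/after"

# -1 hides lines only in the first list, -3 hides lines in both,
# leaving only the lines unique to the second list: the new file.
new_file=$(LC_ALL=C comm -13 "$lists/before" "$lists/after")
echo "new file: $new_file"

rm -rf "$dest" "$lists"
```

Because the comparison works on names rather than timestamps, it does not matter how long the download or the rename took.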







    GNU Coreutils - Comm invocation






    Newsboat config snippet






    key changes





    download-filename-format "%t.%e"
    macro d set browser "echo %u > /tmp/hpr-url.txt && ~/bin/download-and-rename-hpr.sh"; open-in-browser ; set browser lynx



    Provide feedback on this episode.
  • Hacker Public Radio

    HPR4643: HPR Beer Garden 13 - Triple IPA

    2026/05/20
    This show has been flagged as Clean by the host.

    Dave and Kevie continue the Beer Garden series with a look at a relatively new style of IPA known as a Triple. Dave samples Triple Hazy Jane by Brewdog, whilst Kevie tries out Glory Triple IPA by Northern Monk.













    Connect with the guys on Untappd:











    Dave






    Kevie




    Provide feedback on this episode.
About Hacker Public Radio
Hacker Public Radio is a podcast that releases shows every weekday, Monday through Friday. Our shows are produced by the community (you) and can be on any topic that is of interest to hackers and hobbyists.