Fast Elixir RSS feed parser, a NIF wrapper around the Rust RSS crate


Keywords
elixir, feeds, rss, rust

FastRSS

Parse RSS feeds very quickly


Intro | Compatibility | Installation | Usage | Benchmarks | Deploying | License


Intro

  • This is a Rust NIF built using Rustler
  • It uses the Rust RSS crate to do the actual RSS parsing

Speed

FastRSS is already much faster than most of the pure Elixir/Erlang packages out there. In the benchmarks, it was between 6.12x and 50.9x faster than the next fastest package tested (feeder_ex).

Compared to the slowest Elixir options tested (feed_raptor, elixir_feed_parser), FastRSS was up to 259.91x faster (daily input) and used up to 5,466,379.81x less memory (0.00154 MB vs 8424.44 MB on the stuff input).

See the Benchmark section below for the full results.

Compatibility

FastRSS requires a minimum combination of Elixir 1.6.0 and Erlang/OTP 20.0, and is tested with a maximum combination of Elixir 1.11.1 and Erlang/OTP 22.0.

Installation

This package is available on Hex.

It can be installed by adding fast_rss to your list of dependencies in mix.exs:

def deps do
  [
    {:fast_rss, "~> 0.5.0"}
  ]
end

You also need the Rust compiler installed: https://www.rust-lang.org/tools/install

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Usage

There are only two functions: parse_rss/1 for RSS feeds and parse_atom/1 for Atom feeds. Each takes a string and returns {:ok, map()}, where the map has string keys.

iex(1)> {:ok, map_of_rss} = FastRSS.parse_rss("...rss_feed_string...")
iex(2)> Map.keys(map_of_rss)
["categories", "cloud", "copyright", "description", "docs", "dublin_core_ext",
 "extensions", "generator", "image", "items", "itunes_ext", "language",
 "last_build_date", "link", "managing_editor", "namespaces", "pub_date",
 "rating", "skip_days", "skip_hours", "syndication_ext", "text_input", "title",
 "ttl", "webmaster"]

The docs can be found at https://hexdocs.pm/fast_rss.
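
Because the result is a plain map with string keys, it can be consumed with ordinary pattern matching. The following is a minimal sketch, not part of FastRSS: FeedSummary is a hypothetical helper, the {:error, reason} shape is an assumption (only the {:ok, map()} case is described above), and the "title" key on each entry of "items" is also assumed. A hand-built map stands in for real parser output so the snippet runs on its own.

```elixir
defmodule FeedSummary do
  @moduledoc """
  Hypothetical helper (not part of FastRSS) that pulls item titles
  out of a parse result.
  """

  # Happy path: the channel map has an "items" list.
  def item_titles({:ok, %{"items" => items}}) when is_list(items) do
    titles =
      items
      # Assumption: each item is a map that may carry a "title" key.
      |> Enum.map(&Map.get(&1, "title"))
      |> Enum.reject(&is_nil/1)

    {:ok, titles}
  end

  # A feed map without an "items" list yields no titles.
  def item_titles({:ok, _feed}), do: {:ok, []}

  # Assumed error shape; passed through unchanged.
  def item_titles({:error, reason}), do: {:error, reason}
end

# Hand-built stand-in for the output of FastRSS.parse_rss/1.
parsed =
  {:ok,
   %{
     "title" => "Example feed",
     "items" => [
       %{"title" => "First"},
       %{"title" => nil},
       %{"title" => "Second"}
     ]
   }}

{:ok, titles} = FeedSummary.item_titles(parsed)
IO.inspect(titles)
# prints ["First", "Second"]
```

In an application, the hand-built tuple would be replaced by the result of FastRSS.parse_rss/1 (or parse_atom/1, whose map keys may differ).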

Supported Feeds

Reading from the following feed formats and extensions is supported:

  • RSS 0.90
  • RSS 0.91
  • RSS 0.92
  • RSS 1.0
  • RSS 2.0
  • iTunes
  • Dublin Core
  • Atom

Benchmark

HTML: https://avencera.github.io/fast_rss/

Benchmark run from 2020-02-22 05:23:47.524699Z UTC

System

Benchmark suite executing on the following system:

Operating System            macOS
CPU Information             Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
Number of Available Cores   16
Available Memory            32 GB
Elixir Version              1.10.1
Erlang Version              22.2.6

Configuration

Benchmark suite executing with the following configuration:

:time       30 s
:parallel   1
:warmup     5 s

Statistics

Input: anxiety

Run Time

Name                 IPS      Average     Deviation   Median      99th %
fast_rss             188.57   5.30 ms     ±8.26%      5.45 ms     6.43 ms
feeder_ex            3.70     269.92 ms   ±5.34%      268.12 ms   316.12 ms
feed_raptor          2.99     334.01 ms   ±2.44%      331.03 ms   371.28 ms
elixir_feed_parser   1.94     515.72 ms   ±1.94%      516.10 ms   536.05 ms

Comparison

Name                 IPS      Slower
fast_rss             188.57
feeder_ex            3.70     50.9x
feed_raptor          2.99     62.99x
elixir_feed_parser   1.94     97.25x

Memory Usage

Name                 Memory       Factor
fast_rss             0.00156 MB
feeder_ex            17.21 MB     11004.73x
feed_raptor          268.53 MB    171693.91x
elixir_feed_parser   313.30 MB    200316.09x

Input: ben

Run Time

Name                 IPS      Average      Deviation   Median       99th %
fast_rss             83.95    11.91 ms     ±10.29%     12.23 ms     16.17 ms
feeder_ex            13.33    75.04 ms     ±4.38%      74.21 ms     89.72 ms
elixir_feed_parser   3.52     284.18 ms    ±3.89%      283.83 ms    324.08 ms
feed_raptor          0.48     2078.76 ms   ±0.52%      2076.27 ms   2097.44 ms

Comparison

Name                 IPS      Slower
fast_rss             83.95
feeder_ex            13.33    6.3x
elixir_feed_parser   3.52     23.86x
feed_raptor          0.48     174.51x

Memory Usage

Name                 Memory       Factor
fast_rss             0.00155 MB
feeder_ex            27.86 MB     17990.96x
elixir_feed_parser   163.88 MB    105811.88x
feed_raptor          1577.41 MB   1018492.36x

Input: daily

Run Time

Name                 IPS      Average     Deviation   Median      99th %
fast_rss             32.98    0.0303 s    ±7.62%      0.0313 s    0.0339 s
feeder_ex            4.94     0.20 s      ±4.61%      0.199 s     0.24 s
elixir_feed_parser   0.64     1.57 s      ±1.50%      1.57 s      1.63 s
feed_raptor          0.127    7.88 s      ±0.23%      7.88 s      7.90 s

Comparison

Name                 IPS      Slower
fast_rss             32.98
feeder_ex            4.94     6.68x
elixir_feed_parser   0.64     51.86x
feed_raptor          0.127    259.91x

Memory Usage

Name                 Memory       Factor
fast_rss             0.00153 MB
feeder_ex            109.73 MB    71555.78x
elixir_feed_parser   880.51 MB    574178.95x
feed_raptor          6386.12 MB   4164382.64x

Input: dave

Run Time

Name                 IPS      Average     Deviation   Median      99th %
fast_rss             407.08   2.46 ms     ±9.83%      2.41 ms     3.16 ms
feeder_ex            56.52    17.69 ms    ±6.14%      17.37 ms    22.51 ms
elixir_feed_parser   8.90     112.31 ms   ±4.12%      111.93 ms   127.60 ms
feed_raptor          1.59     628.45 ms   ±1.60%      626.71 ms   656.74 ms

Comparison

Name                 IPS      Slower
fast_rss             407.08
feeder_ex            56.52    7.2x
elixir_feed_parser   8.90     45.72x
feed_raptor          1.59     255.83x

Memory Usage

Name                 Memory       Factor
fast_rss             0.00157 MB
feeder_ex            9.25 MB      5886.17x
elixir_feed_parser   80.42 MB     51170.23x
feed_raptor          571.18 MB    363425.45x

Input: sleepy

Run Time

Name                 IPS      Average     Deviation   Median      99th %
fast_rss             760.30   1.32 ms     ±16.62%     1.21 ms     2.03 ms
feeder_ex            124.28   8.05 ms     ±6.94%      8.03 ms     10.32 ms
elixir_feed_parser   26.26    38.09 ms    ±5.08%      37.81 ms    44.42 ms
feed_raptor          3.21     311.16 ms   ±2.85%      307.86 ms   345.09 ms

Comparison

Name                 IPS      Slower
fast_rss             760.30
feeder_ex            124.28   6.12x
elixir_feed_parser   26.26    28.96x
feed_raptor          3.21     236.57x

Memory Usage

Name                 Memory       Factor
fast_rss             0.00157 MB
feeder_ex            4.28 MB      2726.19x
elixir_feed_parser   35.88 MB     22829.92x
feed_raptor          274.98 MB    174963.99x

Input: stuff

Run Time

Name                 IPS      Average     Deviation   Median      99th %
fast_rss             19.19    0.0521 s    ±9.19%      0.0546 s    0.0635 s
feeder_ex            0.93     1.07 s      ±2.49%      1.07 s      1.15 s
elixir_feed_parser   0.53     1.88 s      ±1.22%      1.89 s      1.92 s
feed_raptor          0.0797   12.54 s     ±1.61%      12.44 s     12.77 s

Comparison

Name                 IPS      Slower
fast_rss             19.19
feeder_ex            0.93     20.59x
elixir_feed_parser   0.53     36.11x
feed_raptor          0.0797   240.68x

Memory Usage

Name                 Memory       Factor
fast_rss             0.00154 MB
feeder_ex            140.58 MB    91220.55x
elixir_feed_parser   1018.78 MB   661058.28x
feed_raptor          8424.44 MB   5466379.81x
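
Reading the tables: across all inputs above, the Slower column is, to a close approximation, the ratio of the baseline IPS to the other package's IPS. The sketch below recomputes it from the rounded figures in the tables. BenchRatio is a hypothetical helper written for this illustration. The published factors were presumably computed from unrounded measurements, so recomputed values differ slightly (20.63 here versus the 20.59x reported for the stuff input).

```elixir
defmodule BenchRatio do
  # Hypothetical helper: how many times slower a package is than the
  # baseline, given iterations per second (IPS) for both.
  def slower_factor(baseline_ips, other_ips)
      when is_number(baseline_ips) and is_number(other_ips) and other_ips > 0 do
    Float.round(baseline_ips / other_ips, 2)
  end
end

# fast_rss vs feeder_ex on the "stuff" input, using the rounded table values.
IO.inspect(BenchRatio.slower_factor(19.19, 0.93))
# prints 20.63

# fast_rss vs feeder_ex on the "anxiety" input.
IO.inspect(BenchRatio.slower_factor(188.57, 3.70))
# prints 50.96
```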

Deploying

Deploying Rust NIFs can be a little annoying because you have to install the Rust compiler. We try to alleviate this with rustler_precompiled, which provides precompiled assets for a number of targets (see release.yml for the full list) but does not cover all environments. If you are having trouble deploying this package, open an issue and I will try to help you out. I will then add it to the FAQ below.
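
If no precompiled asset matches your target, one option is to force a local build from source. The fragment below is a minimal sketch under two assumptions that should be checked against the rustler_precompiled documentation: that FastRSS loads its NIF through rustler_precompiled as described above, and that the installed rustler_precompiled version supports the :force_build application setting. A local build still needs the Rust toolchain, and it may also require adding :rustler to your own deps.

```elixir
# config/config.exs
# On Elixir versions before 1.9, `use Mix.Config` takes the place of `import Config`.
import Config

# Assumption: rustler_precompiled reads this setting and compiles the
# fast_rss NIF locally instead of downloading a precompiled asset.
config :rustler_precompiled, :force_build, fast_rss: true
```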

Q. How do I deploy using an Alpine Dockerfile?

A. I recommend using a multi-stage Dockerfile and doing the following:

  1. In the stages where you build your deps and your release, make sure to install build-base and libgcc:

    # This step installs all the build tools we'll need
    RUN apk update && \
        apk upgrade --no-cache && \
        apk add --no-cache \
        git \
        curl \
        build-base \
        libgcc  && \
        mix local.rebar --force && \
        mix local.hex --force

  2. Install the Rust compiler and allow dynamic linking to the C library by setting RUSTFLAGS:

    # install rustup
    RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
    ENV RUSTUP_HOME=/root/.rustup \
        RUSTFLAGS="-C target-feature=-crt-static" \
        CARGO_HOME=/root/.cargo  \
        PATH="/root/.cargo/bin:$PATH"

  3. In the stage where you actually run your Elixir release, install libgcc:

    ################################################################################
    ## STEP 4 - FINAL
    FROM alpine:3.11
    
    ENV MIX_ENV=prod
    
    RUN apk update && \
        apk add --no-cache \
        bash \
        libgcc \
        openssl-dev
    
    COPY --from=release-builder /opt/built /app
    WORKDIR /app
    CMD ["/app/my_app/bin/my_app", "start"]

License

FastRSS is released under the Apache License 2.0 - see the LICENSE file.