microformats2

A microformats2 parser (http://microformats.org/wiki/microformats-2) for Elixir


Microformats2

A Microformats2 parser for Elixir.

Installation

The package can be installed by adding :microformats2 to your list of dependencies in mix.exs:

def deps do
  [
    {:microformats2, "~> 1.0.0"}
  ]
end

To parse directly from URLs, also add :tesla to your list of dependencies in mix.exs:

def deps do
  [
    {:microformats2, "~> 1.0.0"},
    {:tesla, "~> 1.4.4"}
  ]
end

Usage

Give the parser an HTML string and the URL it was fetched from:

Microformats2.parse("""
<div class="h-card">
  <img class="u-photo" alt="photo of Mitchell"
        src="https://webfwd.org/content/about-experts/300.mitchellbaker/mentor_mbaker.jpg"/>
  <a class="p-name u-url"
      href="http://blog.lizardwrangler.com/">Mitchell Baker</a>
  (<a class="u-url" href="https://twitter.com/MitchellBaker">@MitchellBaker</a>)
  <span class="p-org">Mozilla Foundation</span>
  <p class="p-note">
    Mitchell is responsible for setting the direction and scope of the Mozilla Foundation and its activities.
  </p>
  <span class="p-category">Strategy</span>
  <span class="p-category">Leadership</span>
</div>
""", "http://example.org")

The parser returns a map like this:

%{
  "items" => [
    %{
      "properties" => %{
        "category" => ["Strategy", "Leadership"],
        "name" => ["Mitchell Baker"],
        "note" => ["Mitchell is responsible for setting the direction and scope of the Mozilla Foundation and its activities."],
        "org" => ["Mozilla Foundation"],
        "photo" => [
          %{
            "alt" => "photo of Mitchell",
            "value" => "https://webfwd.org/content/about-experts/300.mitchellbaker/mentor_mbaker.jpg"
          }
        ],
        "url" => ["http://blog.lizardwrangler.com/",
         "https://twitter.com/MitchellBaker"]
      },
      "type" => ["h-card"]
    }
  ],
  "rel-urls" => %{},
  "rels" => %{}
}
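
Because the result is a plain map with string keys, values can be read with the standard Map, Enum, and get_in functions. The following is a minimal sketch, not part of the library: the HCardExample module and its two helper functions are hypothetical names introduced for illustration, and the result map is a trimmed literal copy of the output above so the snippet runs without the parser installed.

```elixir
# Hypothetical helper module, used only for this illustration.
defmodule HCardExample do
  # Return all parsed items whose "type" list contains the given type.
  def items_of_type(%{"items" => items}, type) do
    Enum.filter(items, fn item -> type in Map.get(item, "type", []) end)
  end

  # Return the first value of a property, or nil when the property is absent.
  def first_value(item, property) do
    item
    |> get_in(["properties", property])
    |> List.wrap()
    |> List.first()
  end
end

# A trimmed copy of the parse result shown above, written as a literal.
result = %{
  "items" => [
    %{
      "properties" => %{
        "name" => ["Mitchell Baker"],
        "org" => ["Mozilla Foundation"],
        "url" => ["http://blog.lizardwrangler.com/", "https://twitter.com/MitchellBaker"]
      },
      "type" => ["h-card"]
    }
  ],
  "rel-urls" => %{},
  "rels" => %{}
}

# Pick the single h-card and read some of its properties.
[card] = HCardExample.items_of_type(result, "h-card")
name = HCardExample.first_value(card, "name")
urls = get_in(card, ["properties", "url"])

IO.inspect({name, urls})
```

Every property value is a list, even when only one value was found, so a helper like first_value is a convenient way to handle single-valued properties such as "name".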

You can also provide HTML trees already parsed with Floki:

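# Floki.parse/1 is deprecated in current Floki releases; parse_document!/1 returns the tree directly
Microformats2.parse(Floki.parse_document!("<div class=\"h-card\">...</div>"), "http://example.org")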

Or URLs if you have Tesla installed:

Microformats2.parse("http://example.org")

Dependencies

The parser requires Floki for HTML parsing. Tesla is optional and only needed for fetching URLs.

Features

Implemented:

Not implemented:

Copyright and License

Copyright (c) 2018 Christian Kruse <cjk@defunct.ch>

This work is free. You can redistribute it and/or modify it under the terms of the MIT License. See the LICENSE.md file for more details.