Commit

Merge branch 'main' into handle-relative-urls-in-e
angelogladding authored Nov 30, 2023
2 parents d588f2e + ab580f6 commit a612e42
Showing 7 changed files with 62 additions and 43 deletions.
12 changes: 6 additions & 6 deletions .github/workflows/tests.yml
@@ -11,7 +11,7 @@ jobs:
   build-macos:
     strategy:
       matrix:
-        python-version: ["3.8", "3.9", "3.10", "pypy3.8", "pypy3.9", "pypy3.10"]
+        python-version: ["3.8", "3.9", "3.10", "3.11", "3.12"]
     runs-on: "macos-latest"
     steps:
       - name: Install md5sha1sum
@@ -26,7 +26,7 @@ jobs:
       - name: Install Poetry
         uses: snok/install-poetry@v1
         with:
-          version: 1.2.2
+          version: 1.5.1
           virtualenvs-in-project: true
       - name: Install dependencies
         run: poetry install --no-interaction --no-root
@@ -43,7 +43,7 @@ jobs:
   build-linux:
     strategy:
       matrix:
-        python-version: ["3.8", "3.9", "3.10", "pypy3.8", "pypy3.9", "pypy3.10"]
+        python-version: ["3.8", "3.9", "3.10", "3.11", "3.12"]
     runs-on: "ubuntu-latest"
     steps:
       - name: Install libxml2
@@ -60,7 +60,7 @@ jobs:
       - name: Install Poetry
         uses: snok/install-poetry@v1
         with:
-          version: 1.2.2
+          version: 1.5.1
           virtualenvs-in-project: true
       - name: Install dependencies
         run: poetry install --no-interaction --no-root
@@ -77,7 +77,7 @@ jobs:
   build-windows:
     strategy:
       matrix:
-        python-version: ["3.8", "3.9", "3.10"]
+        python-version: ["3.8", "3.9", "3.10", "3.11", "3.12"]
     runs-on: "windows-latest"
     defaults:
       run:
@@ -93,7 +93,7 @@ jobs:
       - name: Install Poetry
         uses: snok/install-poetry@v1
         with:
-          version: 1.2.2
+          version: 1.5.1
           virtualenvs-in-project: true
       - name: Install dependencies
         run: poetry install --no-interaction --no-root
14 changes: 7 additions & 7 deletions CHANGELOG.md
@@ -22,7 +22,7 @@ All notable changes to this project will be documented in this file.
 ## 1.1.1 - 2018-06-15

 - streamline backcompat to use JSON only.
-- fix multiple mf1 root rel-tag parsing
+- fix multiple mf1 root rel-tag parsing
 - correct url and photo for hreview.
 - add rules for nested hreview. update backcompat to use multiple matches in old properties.
 - fix `rel-tag` to `p-category` conversion so that other classes are not lost.
@@ -37,11 +37,11 @@ All notable changes to this project will be documented in this file.
 - better whitespace algorithm for `name` and `html.value` parsing
 - experimental flag for including `alt` in `u-photo` parsing
 - make a copy of the BeautifulSoup given by user to work on for parsing to prevent changes to original doc
-- bump version to 1.1.1
+- bump version to 1.1.1

 ## 1.1.0 - 2018-03-16

-- bump version to 1.1.0 since it is a "major" change
+- bump version to 1.1.0 since it is a "major" change
 - added tests for new implied name rules
 - modified earlier tests to accommodate new rules
 - use space separator instead of "T"
@@ -59,12 +59,12 @@ All notable changes to this project will be documented in this file.
 ## 1.0.6 - 2018-03-04

 - strip leading/trailing white space for `e-*[html]`. update the corresponding tests
-- blank values explicitly authored are allowed as property values
+- blank values explicitly authored are allowed as property values
 - include `alt` or `src` from `<img>` in parsing for `p-*` and `e-*[value]`
-- parse `title` from `<link>` for `p-*` resolves #84
-- and `poster` from `<video>` for `u-*` resolves #76
+- parse `title` from `<link>` for `p-*` resolves #84
+- and `poster` from `<video>` for `u-*` resolves #76
 - use `html5lib` as default parser
-- use the final redirect URL resolves #62
+- use the final redirect URL resolves #62
 - update requirements to use BS4 v4.6.0 and html5lib v1.0.1
 - drop support for Python 2.6 as html5lib dropped support

2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -38,7 +38,7 @@ PRs must pass all tests and linting requirements before they can be merged.
 Before you submit a PR to `mf2py`, run the following command in the base directory of the project:

 ```bash
-make style
+make lint
 ```

 This will format your code using the linters configured with the project.
2 changes: 1 addition & 1 deletion mf2py/datetime_helpers.py
@@ -58,5 +58,5 @@ def normalize_datetime(dtstr, match=None):

     tzstr = match.group("tz")
     if tzstr:
-        dtstr += tzstr
+        dtstr += tzstr.replace(":", "")
     return dtstr
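
A minimal sketch of what this one-line change does to a timezone suffix, assuming the offset has already been captured in a named group `tz`. The pattern `TZ_RE` and the function `strip_tz_colon` are hypothetical names used only for illustration; mf2py's actual regular expressions are not shown in this diff.

```python
import re

# Hypothetical pattern for illustration only: a trailing "Z" or a UTC offset
# such as "+08:00", "-0600" or "-6:00". Not taken from mf2py.
TZ_RE = re.compile(r"(?P<tz>Z|[+-]\d{1,2}:?\d{2})$")


def strip_tz_colon(dtstr):
    """Return dtstr with any colon removed from a trailing UTC offset."""
    match = TZ_RE.search(dtstr)
    if not match:
        # No timezone suffix: leave the string untouched.
        return dtstr
    tzstr = match.group("tz")
    # Same operation as the changed line: drop ":" from the offset only.
    return dtstr[: match.start("tz")] + tzstr.replace(":", "")


print(strip_tz_colon("2014-06-01 12:30-06:00"))
print(strip_tz_colon("2014-06-01 12:15"))
```

Only the offset is rewritten, so colons in the time portion (`12:30`) are preserved; this is consistent with the updated expectations in `test/test_parser.py`, such as `-0600` and `-600`.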
27 changes: 15 additions & 12 deletions mf2py/implied_properties.py
@@ -103,15 +103,20 @@ def get_photo_child(children):
             if not mf2_classes.root(poss_obj.get("class", [])):
                 return poss_obj

-    # if element is an img use source if exists
-    prop_value = get_img_src_alt(el, base_url)
-    if prop_value is not None:
+    def resolve_relative_url(prop_value):
+        if isinstance(prop_value, dict):
+            prop_value["value"] = try_urljoin(base_url, prop_value["value"])
+        else:
+            prop_value = try_urljoin(base_url, prop_value)
         return prop_value

+    # if element is an img use source if exists
+    if prop_value := get_img_src_alt(el, base_url):
+        return resolve_relative_url(prop_value)
+
     # if element is an object use data if exists
-    prop_value = get_attr(el, "data", check_name="object")
-    if prop_value is not None:
-        return prop_value
+    if prop_value := get_attr(el, "data", check_name="object"):
+        return resolve_relative_url(prop_value)

     # find candidate child or grandchild
     poss_child = None
@@ -131,14 +136,12 @@ def get_photo_child(children):
     # if a possible child was found parse
     if poss_child is not None:
         # img get src
-        prop_value = get_img_src_alt(poss_child, base_url)
-        if prop_value is not None:
-            return prop_value
+        if prop_value := get_img_src_alt(poss_child, base_url):
+            return resolve_relative_url(prop_value)

         # object get data
-        prop_value = get_attr(poss_child, "data", check_name="object")
-        if prop_value is not None:
-            return prop_value
+        if prop_value := get_attr(poss_child, "data", check_name="object"):
+            return resolve_relative_url(prop_value)


 def url(el, base_url=""):
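
The hunks above replace `is not None` checks with assignment expressions (`:=`, available from Python 3.8), which test truthiness rather than identity with `None`. The sketch below illustrates the general difference only; `first_non_none` and `first_truthy` are hypothetical helpers, and whether `get_img_src_alt` or `get_attr` can actually return an empty string is not visible in this diff.

```python
def first_non_none(candidates):
    """Old style: accept any value that is not None, including ""."""
    for value in candidates:
        if value is not None:
            return value
    return None


def first_truthy(candidates):
    """New style: the walrus form skips every falsy value, including ""."""
    for candidate in candidates:
        if value := candidate:
            return value
    return None


values = ["", "jane-img.jpeg"]
print(repr(first_non_none(values)))
print(repr(first_truthy(values)))
```

If an attribute such as `src=""` did yield an empty string, the old code would have returned it, while the new code would fall through to the next candidate.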
@@ -0,0 +1,5 @@
+<html>
+<base href="http://example.com">
+<img class="h-card" alt="Jane Doe" src="jane-img.jpeg">
+<object class="h-card" data="jane-object.jpeg">Jane Doe</object>
+</html>
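
The `resolve_relative_url` helper added in `mf2py/implied_properties.py` can be approximated in isolation as follows, using the values from this fixture. This is a sketch under stated assumptions: `try_urljoin` here is a stand-in built on the standard library's `urllib.parse.urljoin`, since mf2py's own implementation is not part of this diff, and `base_url` is passed explicitly instead of being captured from the enclosing function.

```python
from urllib.parse import urljoin


def try_urljoin(base, url):
    """Stand-in for mf2py's try_urljoin (assumed behaviour): join the URL
    against the base, returning the URL unchanged if joining fails."""
    try:
        return urljoin(base, url)
    except ValueError:
        return url


def resolve_relative_url(base_url, prop_value):
    """Resolve a photo value that is either a plain URL string or a dict
    with a "value" key (the img-with-alt case)."""
    if isinstance(prop_value, dict):
        # Mirrors the diff: the dict is updated in place.
        prop_value["value"] = try_urljoin(base_url, prop_value["value"])
    else:
        prop_value = try_urljoin(base_url, prop_value)
    return prop_value


base = "http://example.com"
print(resolve_relative_url(base, {"value": "jane-img.jpeg", "alt": "Jane Doe"}))
print(resolve_relative_url(base, "jane-object.jpeg"))
```

The two calls correspond to the `<img>` and `<object>` elements in the fixture and produce the absolute URLs asserted in `test_implied_photo`.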
43 changes: 27 additions & 16 deletions test/test_parser.py
@@ -136,11 +136,11 @@ def test_plain_child_microformat():

 def test_datetime_parsing():
     result = parse_fixture("datetimes.html")
-    assert result["items"][0]["properties"]["start"][0] == "2014-01-01T12:00:00+00:00"
-    assert result["items"][0]["properties"]["end"][0] == "3014-01-01T18:00:00+00:00"
+    assert result["items"][0]["properties"]["start"][0] == "2014-01-01T12:00:00+0000"
+    assert result["items"][0]["properties"]["end"][0] == "3014-01-01T18:00:00+0000"
     assert result["items"][0]["properties"]["duration"][0] == "P1000Y"
-    assert result["items"][0]["properties"]["updated"][0] == "2011-08-26T00:01:21+00:00"
-    assert result["items"][0]["properties"]["updated"][1] == "2011-08-26T00:01:21+00:00"
+    assert result["items"][0]["properties"]["updated"][0] == "2011-08-26T00:01:21+0000"
+    assert result["items"][0]["properties"]["updated"][1] == "2011-08-26T00:01:21+0000"


 def test_datetime_vcp_parsing():
def test_datetime_vcp_parsing():
Expand All @@ -149,19 +149,19 @@ def test_datetime_vcp_parsing():
assert result["items"][1]["properties"]["published"][0] == "3014-01-01 01:21Z"
assert result["items"][2]["properties"]["updated"][0] == "2014-03-11 09:55"
assert result["items"][3]["properties"]["published"][0] == "2014-01-30 15:28"
assert result["items"][4]["properties"]["published"][0] == "9999-01-14T11:52+08:00"
assert result["items"][5]["properties"]["published"][0] == "2014-06-01 12:30-06:00"
assert result["items"][8]["properties"]["start"][0] == "2014-06-01 12:30-06:00"
assert result["items"][9]["properties"]["start"][0] == "2014-06-01 12:30-06:00"
assert result["items"][10]["properties"]["start"][0] == "2014-06-01 00:30-06:00"
assert result["items"][4]["properties"]["published"][0] == "9999-01-14T11:52+0800"
assert result["items"][5]["properties"]["published"][0] == "2014-06-01 12:30-0600"
assert result["items"][8]["properties"]["start"][0] == "2014-06-01 12:30-0600"
assert result["items"][9]["properties"]["start"][0] == "2014-06-01 12:30-0600"
assert result["items"][10]["properties"]["start"][0] == "2014-06-01 00:30-0600"
assert result["items"][10]["properties"]["end"][0] == "2014-06-01 12:15"
assert result["items"][10]["properties"]["start"][1] == "2014-06-01 00:30-06:00"
assert result["items"][10]["properties"]["start"][1] == "2014-06-01 00:30-0600"
assert result["items"][10]["properties"]["end"][1] == "2014-06-01 12:15"
assert result["items"][11]["properties"]["start"][0] == "2016-03-02 00:30-06:00"
assert result["items"][12]["properties"]["start"][0] == "2014-06-01 12:30-6:00"
assert result["items"][13]["properties"]["start"][0] == "2014-06-01 12:30+6:00"
assert result["items"][11]["properties"]["start"][0] == "2016-03-02 00:30-0600"
assert result["items"][12]["properties"]["start"][0] == "2014-06-01 12:30-600"
assert result["items"][13]["properties"]["start"][0] == "2014-06-01 12:30+600"
assert result["items"][14]["properties"]["start"][0] == "2014-06-01 12:30Z"
assert result["items"][15]["properties"]["start"][0] == "2014-06-01 12:30-6:00"
assert result["items"][15]["properties"]["start"][0] == "2014-06-01 12:30-600"


def test_dt_end_implied_date():
@@ -175,8 +175,8 @@ def test_dt_end_implied_date():
     assert event_wo_tz["properties"]["end"][0] == "2014-05-21 19:30"

     event_w_tz = result["items"][7]
-    assert event_w_tz["properties"]["start"][0] == "2014-06-01 12:30-06:00"
-    assert event_w_tz["properties"]["end"][0] == "2014-06-01 19:30-06:00"
+    assert event_w_tz["properties"]["start"][0] == "2014-06-01 12:30-0600"
+    assert event_w_tz["properties"]["end"][0] == "2014-06-01 19:30-0600"


 def test_embedded_parsing():
@@ -465,6 +465,17 @@ def test_implied_photo():
     for i in range(12, 23):
         assert "photo" not in result["items"][i]["properties"]

+    result = parse_fixture("implied_properties/implied_photo_relative_url.html")
+
+    assert (
+        result["items"][0]["properties"]["photo"][0]["value"]
+        == "http://example.com/jane-img.jpeg"
+    )
+    assert (
+        result["items"][1]["properties"]["photo"][0]
+        == "http://example.com/jane-object.jpeg"
+    )
+

 def test_implied_url():
     result = parse_fixture("implied_properties/implied_url.html")