In the aforementioned bug report, feed2exec crashes hard (with a
backtrace, before the run completes) on the following feed:
http://www.agendadulibre.org/events.rss?region=12
The full backtrace is:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/feedparser.py", line 3774, in _gen_georss_coords
    t = [nxt(), nxt()][::swap and -1 or 1]
StopIteration

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/bin/feed2exec", line 11, in <module>
    load_entry_point('feed2exec==0.15.0', 'console_scripts', 'feed2exec')()
  File "/usr/lib/python3/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/lib/python3/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/click/decorators.py", line 27, in new_func
    return f(get_current_context().obj, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/feed2exec/__main__.py", line 124, in fetch
    st.fetch(parallel, force=force, catchup=catchup)
  File "/usr/lib/python3/dist-packages/feed2exec/feeds.py", line 162, in fetch
    self.dispatch(feed, feed.parse(body), None, force)
  File "/usr/lib/python3/dist-packages/feed2exec/feeds.py", line 399, in parse
    data = feedparser.parse(body)
  File "/usr/lib/python3/dist-packages/feedparser.py", line 3965, in parse
    saxparser.parse(source)
  File "/usr/lib/python3.7/xml/sax/expatreader.py", line 111, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/lib/python3.7/xml/sax/xmlreader.py", line 125, in parse
    self.feed(buffer)
  File "/usr/lib/python3.7/xml/sax/expatreader.py", line 217, in feed
    self._parser.Parse(data, isFinal)
  File "../Modules/pyexpat.c", line 471, in EndElement
  File "/usr/lib/python3.7/xml/sax/expatreader.py", line 381, in end_element_ns
    self._cont_handler.endElementNS(pair, None)
  File "/usr/lib/python3/dist-packages/feedparser.py", line 2060, in endElementNS
    self.unknown_endtag(localname)
  File "/usr/lib/python3/dist-packages/feedparser.py", line 704, in unknown_endtag
    method()
  File "/usr/lib/python3/dist-packages/feedparser.py", line 1471, in _end_georss_point
    geometry = _parse_georss_point(self.pop('geometry'))
  File "/usr/lib/python3/dist-packages/feedparser.py", line 3783, in _parse_georss_point
    coords = list(_gen_georss_coords(value, swap, dims))
RuntimeError: generator raised StopIteration
I can also reproduce this with plain feedparser:
python3 -c 'import feedparser; feedparser.parse("http://www.agendadulibre.org/events.rss?region=12")'
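The RuntimeError at the bottom of the backtrace is consistent with
PEP 479: since Python 3.7, a StopIteration that escapes from inside a
generator body is converted into a RuntimeError. The following is a
minimal sketch of that mechanism, not feedparser's actual code;
gen_pairs and try_collect are hypothetical names made up for
illustration.

```python
def gen_pairs(values):
    """Yield (a, b) pairs, relying on StopIteration to end the loop."""
    nxt = iter(values).__next__
    while True:
        # When the input runs out, nxt() raises StopIteration *inside*
        # the generator body. Python 3.7+ turns that into RuntimeError.
        yield (nxt(), nxt())


def try_collect(values):
    """Return the list of pairs, or the RuntimeError that was raised."""
    try:
        return list(gen_pairs(values))
    except RuntimeError as error:
        return error


if __name__ == "__main__":
    print(repr(try_collect([48.85, 2.35])))
```

Under these assumptions, any finite input ends in a RuntimeError,
because the loop has no other way to terminate.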
So, in feed2exec's defense, this is purely feedparser's fault. Still,
we should play the defensive programming game and not let feedparser
failing on a single feed crash the entire run. And even if it did, we
should still tell the user something nicer than a backtrace (although,
to be fair, maybe we should tell *developers* the full backtrace).
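A minimal sketch of that defensive approach, assuming a hypothetical
parse_all helper (this is not feed2exec's actual code): each feed is
parsed inside its own try/except, the user gets a one-line warning,
and the full backtrace is only logged at debug level. The parse
argument stands in for feedparser.parse.

```python
import logging

logger = logging.getLogger("feed2exec.sketch")  # hypothetical logger name


def parse_all(feeds, parse):
    """Parse every feed body, skipping the ones the parser chokes on.

    feeds: mapping of feed name to raw body
    parse: callable taking a body (stand-in for feedparser.parse)
    Returns (parsed, failed): a dict of results and a list of names.
    """
    parsed, failed = {}, []
    for name, body in feeds.items():
        try:
            parsed[name] = parse(body)
        except Exception as error:
            # Short, readable message for users...
            logger.warning("cannot parse feed %s, skipping: %s", name, error)
            # ...and the full backtrace for developers, at debug level.
            logger.debug("backtrace for feed %s", name, exc_info=True)
            failed.append(name)
    return parsed, failed
```

Catching the broad Exception class is deliberate here: the point is
that no parser failure, whatever its type, should abort the other
feeds.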
The feed is, at the time of writing, valid according to this:
http://www.feedvalidator.org/check.cgi?url=http%3A%2F%2Fwww.agendadulibre.org%2Fevents.rss%3Fregion%3D12
I am not adding it to the test suite because it is not clear that
this is legally allowed. According to this, the agendadulibre.org
source code is free software (AGPL):
https://www.agendadulibre.org/pages/infos#puis-je-utiliser-le-logiciel-de-lagenda-du-libre-pour-mon-agenda
... but this is more worrisome:
https://www.agendadulibre.org/pages/infos#a-nametraitementtraitement-des-donnes-personnellesa
In English, it says that people are allowed to request that their
personal data be removed, which is a fair policy for hosting an
agenda, but could be painful if I store that data in a git
repository.
So trust me: it's broken, and this fixes it, kind of.